But recent implementations of gopher servers introduced a new concept, Gophermaps: server-side files that describe what should be displayed to the client. That also means the server has some parsing work to do. But it should be OK: C comes with maybe one of the best string manipulation libraries ever made, with short, descriptive function names (for instance, strpbrk or strrchr) and a sane way to store string length (I really wonder why anyone would use an alternate implementation such as bstring).
So, the problem was quite simple: I wanted to read several pieces of non-tab data separated by tabs, and wrote something like this:
sscanf(buffer, "%s\t%s", &first, &second);
I expected it to read some non-space data (yes, there's already a problem here if there's a space before the tab), then a tab, then some more non-space data. It doesn't.
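To make the surprise concrete, here is a minimal sketch that wraps the same pattern in a small function. The name split_naive, the 64-byte buffers and the %63s field widths are my own assumptions for illustration, not code from the gopher server.

```c
#include <stdio.h>

/* Hypothetical helper: parse a line with the naive "%s\t%s" pattern.
   first and second must each point to at least 64 bytes.
   Returns the number of fields sscanf managed to convert. */
int split_naive(const char *buffer, char *first, char *second)
{
    /* The widths (63) are an added safety assumption; the pattern is
       otherwise the one from the text above. */
    return sscanf(buffer, "%63s\t%63s", first, second);
}
```

Called on the line "hello world\tsecond field", it returns 2 and stores hello and world: the split happens at the space, and the tab plays no special role at all.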
After reading the man page and doing some tests, I understood a few things about my expression:
%s first skips leading whitespace, then reads non-whitespace data;
\t matches any amount of whitespace (including none); in fact, any whitespace character in a pattern matches any amount of whitespace.
So my call behaves exactly like this one:
sscanf(buffer, "%s%s", &first, &second);
and is equivalent to this regular expression:
"[[:space:]]*([^[:space:]]+)[[:space:]]*([^[:space:]]+)"
Not really what I wanted to read...
First I needed to figure out how to read non-tab data only, and it turns out this part was simple enough. The square brackets in scanf patterns work much the same way as in regular expressions. So %[^\t] will read a sequence of non-tab characters (oh yeah... a \t between square brackets only matches a tab).
Next I had to read the "only one tab" part, and this part was a lot more fun. %[\t] would match a sequence of tabs. To limit the match to a given number of tabs, you have to put a decimal number between the % and the square bracket. The pattern is now %1[\t], but there is still a problem: each conversion specification (those %... things) needs to be stored in some output variable, and it would be stupid to store a tab each time I need to read one. Scanf provides the * modifier, which tells it to discard the output of the conversion specification. The final pattern for reading a single tab is then %*1[\t].
And the correct version of my scanf is:
sscanf(buffer, "%[^\t]%*1[\t]%[^\t]", &first, &second);
For those curious about my gopher server, the mercurial repository is here: http://hg.tuxfamily.org/mercurialroot/gophrier/gophrier/ and there's a mirror here: https://bitbucket.org/guillaume/gophrier
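For anyone who wants to try the tab-only pattern, here is a minimal sketch that wraps it in a function. The name split_on_tab and the 64-byte buffers are assumptions of mine, and I added a field width of 63 to each %[ conversion so that an overlong field cannot overflow its destination; that width is not part of the pattern discussed above.

```c
#include <stdio.h>

/* Hypothetical helper: split a line into two tab-separated fields.
   first and second must each point to at least 64 bytes.
   Returns the number of fields sscanf managed to convert. */
int split_on_tab(const char *buffer, char *first, char *second)
{
    /* %63[^\t]  reads up to 63 non-tab characters (spaces included)
       %*1[\t]   reads exactly one tab and discards it
       %63[^\t]  reads the next run of non-tab characters */
    return sscanf(buffer, "%63[^\t]%*1[\t]%63[^\t]", first, second);
}
```

On "hello world\tsecond field" this returns 2, with hello world in the first buffer and second field in the second. One thing to keep in mind: a %[ conversion has to match at least one character, so a line with an empty field such as "a\t\tb" stops after the first field and returns 1.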