Comments (5)
To clarify further:
The actual autonomous_system_organization field in GeoLite2-ASN-Blocks-IPv4.csv and GeoLite2-ASN-Blocks-IPv6.csv is at most 95 characters long, including the enclosing double quotes (") if any. All characters are ASCII, so it is safe to assume that the string can be stored without truncation in a 96-byte array, char asfield[96]. I checked this by converting the field to ASCII with iconv, which produced no errors, and against the MaxMind and RIPE documentation.
According to MaxMind, the CSV files are UTF-8 encoded:
https://dev.maxmind.com/geoip/docs/databases/city-and-country?lang=en
According to RIPE, org-name is ASCII only; see the org-name entry in
https://apps.db.ripe.net/docs/20.Appendices/01-Appendix-A--Syntax-of-Object-Attributes.html
RIPE does not impose a limit on the length of this field but, apparently, MaxMind truncates it to 95 characters, or to 93 characters if it has to be enclosed in double quotes. For example, for 62.3.160.0 the org-name ( https://ipinfo.io/AS9112 ) "Institute of Bioorganic Chemistry Polish Academy of Science, Poznan Supercomputing and Networking Center" is truncated by MaxMind to "Institute of Bioorganic Chemistry Polish Academy of Science, Poznan Supercomputing and Networ", which is 95 characters including the double quotes.
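The two assumptions above (ASCII only, at most 95 characters) could also be checked per field in code. The following is only a minimal sketch: the helper name org_name_fits and the macro ORG_NAME_BUF are hypothetical and are not part of nfdump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constant: the 96-byte buffer size proposed in this comment. */
#define ORG_NAME_BUF 96

/* Hypothetical helper, not part of nfdump: returns true if `field` is pure
 * 7-bit ASCII and fits into ORG_NAME_BUF bytes including the trailing NUL,
 * i.e. it has at most ORG_NAME_BUF - 1 characters. */
static bool org_name_fits(const char *field) {
    assert(field != NULL);
    size_t len = 0;
    for (const unsigned char *p = (const unsigned char *)field; *p != '\0'; p++) {
        if (*p > 0x7F)
            return false; /* non-ASCII byte, e.g. part of a UTF-8 sequence */
        if (++len >= ORG_NAME_BUF)
            return false; /* no room left for the terminating NUL */
    }
    return true;
}
```

Running such a check over every org name while loading the CSV would flag any future file that breaks either assumption, instead of silently truncating the name.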
from nfdump.
My version of a solution, which has worked for me for a couple of weeks, is to replace "char orgName[64];" in struct asV4Node_s and struct asV6Node_s in maxmind.h with
#define orgNameLength 96
char orgName[orgNameLength];
and then to create a function csvnsep which, when a field starts with ", looks for the terminating combination of a double quote followed by a comma (",).
This function can replace the call to strsep in the functions loadASV4tree and loadASV6tree in maxmind.c like this:
while ((field = strsep(&l, ",")) != NULL) {
->
while ((field = csvnsep(&l, orgNameLength-1, ',', '"')) != NULL) {
and
case 2: // org name
strncpy(asV4Node.orgName, field, 64);
asV4Node.orgName[63] = '\0';
->
case 2: // org name
/* VRO: changed to properly process overlong strings and strings containing commas and double quotes */
strncpy(asV4Node.orgName, field, orgNameLength);
asV4Node.orgName[orgNameLength-1] = '\0';
(and likewise for asV6Node.orgName in loadASV6tree)
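To illustrate why plain strsep is not enough here, the sketch below counts the fields that strsep produces for a line. The helper name count_fields_strsep is hypothetical, and the sample line in the note after the code is made up (it uses a documentation address range and AS number); it is not taken from the MaxMind files.

```c
#define _DEFAULT_SOURCE /* makes strsep() visible on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of nfdump: counts the fields that plain
 * strsep() yields when splitting a modifiable CSV line on commas.
 * strsep() knows nothing about quoting, so a comma inside a quoted
 * field is treated as a separator like any other. */
static size_t count_fields_strsep(char *line) {
    assert(line != NULL);
    size_t n = 0;
    char *cursor = line;
    while (strsep(&cursor, ",") != NULL)
        n++;
    return n;
}
```

For a made-up line such as 192.0.2.0/24,64500,"Example Networks, Inc." this returns 4 rather than 3, because the quoted org name is split at its inner comma. That is the case a quote-aware splitter has to handle.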
Besides properly identifying the end of a quoted string, the function csvnsep also takes a maximum-length parameter, to make sure that the '\0' is placed no later than the maximum length and that a string which starts with a double quote always ends with a double quote.
The function is below. Feel free to use it (or not); note that I have not done professional programming for many years.
/* requires <string.h> for strcspn, strstr and strlen */
char *csvnsep (char **stringp, const size_t max_field_length, const char delim_char, const char quote_char)
{
    /* max_field_length truncates the field if it exceeds this value; if 0 then no check */
    char *begin, *end, *end_quoted;
    char delim[2];
    char quoted_field_end[3];

    delim[0] = delim_char;
    delim[1] = '\0';
    quoted_field_end[0] = quote_char;
    quoted_field_end[1] = delim_char;
    quoted_field_end[2] = '\0';

    begin = *stringp;
    if (begin == NULL)
        return NULL;

    if (*begin != quote_char) {
        /* Go the usual strsep way as in glibc */
        /* Find the end of the token. */
        end = begin + strcspn (begin, delim);
        if (*end)
        {
            /* Terminate the token and set *STRINGP past NUL character. */
            *end++ = '\0';
            *stringp = end;
        }
        else
            /* No more delimiters; this is the last token. */
            *stringp = NULL;
        /* Check whether the length exceeds max_field_length */
        if (max_field_length > 0 && strlen(begin) > max_field_length)
            begin[max_field_length] = '\0';
        return begin;
    }
    else {
        /* The field begins with a quote char */
        if ((end_quoted = strstr(begin + 1, quoted_field_end)) != NULL) {
            /* begin+1 for the unlikely case where the field starts with ", */
            if (end_quoted[2] == '\0')
                *stringp = NULL;
            else
                *stringp = end_quoted + 2;
            /* Replace the comma (separator) with \0 */
            end_quoted[1] = '\0';
        }
        else {
            /* A regular end of field was not found: either it is the last field or it is a mistake; treat the field as running to the end of the line */
            *stringp = NULL;
        }
        /* Check whether the length exceeds max_field_length */
        if (max_field_length > 0 && strlen(begin) > max_field_length)
            begin[max_field_length] = '\0';
        /* Make sure that the field ends with quote_char */
        if (strlen(begin) > 1)
            begin[strlen(begin) - 1] = quote_char;
        return begin;
    }
}
from nfdump.
Thanks for the report. I will use a slightly different approach to fix this. In the end it should work with length 96.
from nfdump.
Fixed in master repo.
from nfdump.
Great. Many thanks.
It works as expected: there are no differences between the autonomous_system_organization names produced by geolookup and the originals from MaxMind.
Please keep in mind that MaxMind may in the future add new data fields to the right of the existing fields, as stated on their site:
https://dev.maxmind.com/geoip/docs/databases/asn?lang=en
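One possible way to be robust against such added columns is to look up the column position from the CSV header row instead of hard-coding it. The sketch below is only an illustration of that idea, not what nfdump does: the helper name find_column is hypothetical, and it assumes header names are unquoted and contain no commas.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of nfdump: returns the zero-based index of
 * the column called `name` in a comma-separated header line, or -1 if the
 * column is absent. Assumes unquoted header names without embedded commas. */
static int find_column(const char *header, const char *name) {
    assert(header != NULL && name != NULL);
    size_t name_len = strlen(name);
    int index = 0;
    const char *p = header;
    for (;;) {
        /* Length of the current header cell, stopping at a comma or line end. */
        size_t len = strcspn(p, ",\r\n");
        if (len == name_len && strncmp(p, name, len) == 0)
            return index;
        if (p[len] != ',')
            return -1; /* end of header reached without a match */
        p += len + 1;
        index++;
    }
}
```

With the index of autonomous_system_organization resolved once per file, the field switch in the loader could compare against that index, so extra columns appended on the right would simply be ignored.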
from nfdump.