File type identification fixes

Description

I have some changes nearly queued up for the 2.4 release in the repository (topic/seth/more-file-type-ident-fixes), but a bit more work needs to be done.

There may be one more breaking change to the files API coming in this branch too. Jon and I discussed some options, and I think that creating a new event named file_sniff in place of the file_mime_type event makes sense. We can put the mime type and other "sniff"-originated data in a record on that event so that we can extend it cleanly (and without breaking APIs) in the future. I think it will look something like this:

```
type fa_sniff: record {
	## Depth sniffed.
	depth: count &default=0;

	## Sniffed mime type if one was discovered.
	mime_type: string &optional;
};

event file_sniff(f: fa_file, sniff: fa_sniff)
	{
	if ( sniff?$mime_type )
		{
		print sniff$mime_type;
		}
	}
```
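
A minimal sketch of how such an extension could look later, assuming the proposed fa_sniff record and file_sniff event exist as written above. The charset field is hypothetical and only illustrates the mechanism: fields added to an existing record must be &optional or carry a &default, so handlers written against the original record keep working unchanged.

```
# Hypothetical later extension of the proposed record. Fields added via
# redef must be &optional or have a &default.
redef record fa_sniff += {
	## Character set, if the sniffer could determine one (illustrative only).
	charset: string &optional;
};

# The event signature is unchanged, so existing handlers are unaffected.
# New handlers test for the field before reading it.
event file_sniff(f: fa_file, sniff: fa_sniff)
	{
	if ( sniff?$charset )
		print fmt("%s: charset %s", f$id, sniff$charset);
	}
```

Whether new fields would arrive through redef or through a change to the base record is not settled here. In either case the event keeps its two parameters.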

One other thing this branch will address is a performance degradation from certain file signatures interacting with each other poorly.

Environment

None

Activity

Robin Sommer
April 10, 2015, 6:23 PM

Jon to propose new names for fa_* types.

Jon Siwek
April 10, 2015, 9:36 PM

Seth, topic/jsiwek/bit-1368 has the changes to the mime type detection script API that you can merge into your branch for finalization when you're ready. For the naming, I went with:

Seth Hall
April 20, 2015, 4:10 PM

Thanks Jon! I reverted to the naming I was using (although I'm already taking some flak for it).

My topic/seth/more-file-type-ident-fixes is ready for merging. There are branches of the same name in bro-testing and bro-testing-private as well.

Merging this branch also merges the contents of Jon's topic/jsiwek/bit-1368 branch.

Robin Sommer
April 20, 2015, 8:32 PM

I'm seeing significant performance improvements after this merge, roughly 4-7% on the external tests (in a debug-mode compile).

Seth Hall
April 20, 2015, 8:51 PM

Yay! That's to be expected. This came from fixing the signatures that were interacting poorly with each other. We don't have unending DFA state construction anymore.

Assignee

Robin Sommer

Reporter

Seth Hall

Labels

None

External issue ID

None

Components

Fix versions

Affects versions

Priority

Normal