Add new string functions

Description

Bro appears to have a rather limited list of built-in string functions. This lack of functionality makes it more difficult to write high-performance Bro scripts that require large amounts of string processing.

Using Python as a model (https://docs.python.org/2/library/stdtypes.html#string-methods), I think Bro would benefit from having the following additional string functions:

  • str.count - Return the number of non-overlapping occurrences of substring sub in the range [start, end].

  • str.find - Return the lowest index in the string where substring sub is found within the slice s[start:end].

  • str.lstrip - Return a copy of the string with leading characters removed. The chars argument is a string specifying the set of characters to be removed.

  • str.replace - Return a copy of the string with all occurrences of substring old replaced by new.

  • str.rstrip - Return a copy of the string with trailing characters removed. The chars argument is a string specifying the set of characters to be removed.

  • str.strip - Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed.
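For reference, the Python behaviour that the list above uses as its model can be shown in a few lines. This is only a sketch of the Python semantics, not a proposal for Bro syntax; the sample strings are made up for illustration.

```python
# Python reference behaviour for the string methods listed above.
# The sample inputs are illustrative only.
padded = "  GET /index.html HTTP/1.1  "

results = {
    # Only non-overlapping matches are counted, so "ana" occurs once in "banana".
    "count": "banana".count("ana"),
    # Lowest index of the substring, or -1 when it is absent.
    "find": "banana".find("na"),
    "find_from_3": "banana".find("na", 3),
    "find_missing": "banana".find("xyz"),
    # With no chars argument, whitespace is stripped.
    "lstrip": padded.lstrip(),
    "rstrip": padded.rstrip(),
    # With a chars argument, any character in the set is stripped.
    "strip": "xxhixx".strip("x"),
    "replace": "a-b-c".replace("-", "+"),
}

for name, value in results.items():
    print(f"{name}: {value!r}")
```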

Additionally, I think a find_all implementation that returns the index of every match would be helpful, as described in this Stack Overflow question: http://stackoverflow.com/questions/4664850/find-all-occurrences-of-a-substring-in-python
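A minimal Python sketch of such an index-returning search is given here to make the request concrete. The name `find_all_indexes` and the `overlapping` flag are hypothetical, chosen for illustration; they are not part of Bro or of the Python standard library.

```python
from typing import List


def find_all_indexes(haystack: str, needle: str, overlapping: bool = False) -> List[int]:
    """Return the start index of every occurrence of needle in haystack.

    Hypothetical helper: by default matches do not overlap, mirroring
    str.count; pass overlapping=True to report overlapping matches too.
    """
    if not needle:
        raise ValueError("needle must be a non-empty string")

    indexes = []
    # After each hit, resume either one character later (overlapping)
    # or just past the end of the match (non-overlapping).
    step = 1 if overlapping else len(needle)
    pos = haystack.find(needle)
    while pos != -1:
        indexes.append(pos)
        pos = haystack.find(needle, pos + step)
    return indexes


print(find_all_indexes("abcabc", "bc"))
print(find_all_indexes("banana", "ana", overlapping=True))
```

With this sketch, a count of occurrences is simply the length of the returned list.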

I am willing to put forth the effort in writing these functions if the resulting code (assuming it meets all quality requirements) would be merged in.

Environment

None

Activity

Moshe Kaplan
November 16, 2016, 6:04 AM
Edited

I had missed the existence of `sub`, so it appears str.replace is not needed.

As an additional entry:

  • str.rfind - Return the highest index in the string where substring sub is found.

Current list:

  • str.count - Return the number of non-overlapping occurrences of substring sub in the range [start, end].

  • str.find - Return the lowest index in the string where substring sub is found within the slice s[start:end].

  • str.lstrip - Return a copy of the string with leading characters removed. The chars argument is a string specifying the set of characters to be removed.

  • str.rfind - Return the highest index in the string where substring sub is found.

  • str.rstrip - Return a copy of the string with trailing characters removed. The chars argument is a string specifying the set of characters to be removed.

  • str.strip - Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed.

Seth Hall
February 20, 2017, 4:54 PM

We'd merge these in if you implemented them. Are you still planning on doing so? I have a couple of notes too...

  • `count` can be done with the built-in `find_all` function.

  • `find` can currently be implemented with the built-in `strstr` function.

  • `lstrip` and `rstrip` can be implemented with the `sub` function.

  • `strip` is already implemented.
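The idea in the notes above, building `count`, `lstrip`, and `rstrip` on top of pattern matching and substitution, can be sketched in Python. Here `re.findall` and `re.sub` stand in for Bro's `find_all` and `sub`; this is an analogy under that assumption, not Bro code, and the helper names are hypothetical. In particular, Python's `re.findall` returns a list that keeps repeated matches; whether Bro's `find_all` return type does the same should be checked against the Bro documentation before relying on it for counting.

```python
import re


def count_via_findall(text: str, pattern: str) -> int:
    """Count non-overlapping regex matches by collecting them all first."""
    return len(re.findall(pattern, text))


def lstrip_via_sub(text: str, chars: str) -> str:
    """Remove leading characters drawn from chars using one substitution."""
    if not chars:
        return text
    # \A anchors at the very start; re.escape keeps characters such as
    # ']' or '-' from changing the meaning of the character class.
    return re.sub(r"\A[" + re.escape(chars) + "]+", "", text)


def rstrip_via_sub(text: str, chars: str) -> str:
    """Remove trailing characters drawn from chars using one substitution."""
    if not chars:
        return text
    # \Z anchors at the very end of the string.
    return re.sub("[" + re.escape(chars) + r"]+\Z", "", text)


print(count_via_findall("a.b.c", r"\."))
print(lstrip_via_sub("xxhixx", "x"))
print(rstrip_via_sub("xxhixx", "x"))
```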

What might make more sense is to implement these scripts as a Bro package. You aren't dependent on a Bro release that way and you can distribute directly to users.

You can find information about the Bro package manager here: http://bro-package-manager.readthedocs.io/en/stable/

Moshe Kaplan
February 22, 2017, 9:23 AM

Unfortunately, I am no longer available to implement this functionality.

Jon Siwek
September 21, 2018, 2:17 AM
Won't Do

Assignee

Unassigned

Reporter

Moshe Kaplan

Labels

None

External issue ID

None

Components

Affects versions

Priority

Normal