Documentation ¶
Overview ¶
Package stringsparser provides a flexible string parsing library with support for separators, quoting, escaping, and character set validation.
The parser splits input strings into elements using configurable separator runes, while respecting quoted sections and processing escape sequences. It supports optional validation against predefined or custom character sets, making it suitable for parsing file paths, command-line arguments, CSV-like data, and other structured text formats.
Key features:
- Support for multiple configurable separators (spaces, tabs, commas, or custom runes)
- Single and double quote handling
- Configurable escape sequences (\n, \t, \\, \", \', etc.)
- Configurable empty element handling
- Character set validation with predefined charsets for POSIX paths, Windows paths, and alphanumeric text
- Case-sensitive or case-insensitive charset matching
- Detailed error reporting with character positions
- Custom element processing functions
Basic usage:
result, err := stringsparser.Parse("foo bar 'baz qux'")
// Returns: ["foo", "bar", "baz qux"]
With charset validation:
result, err := stringsparser.Parse(
"/home/user/file.txt",
stringsparser.WithWindowsPath(), // validates Windows path format and charset
)
Index ¶
- Variables
- func Parse(str string, opts ...Option) ([]string, error)
- type Charset
- type InvalidCharError
- type Option
- func AppendProcessRuneFuncs(fns ...ProcessRuneFunc) Option
- func WithAllowEmpty(allow bool) Option
- func WithCharset(charset *Charset) Option
- func WithProcessFunc(fn ProcessFunc) Option
- func WithProcessRuneFuncs(fns ...ProcessRuneFunc) Option
- func WithSeparators(separators ...rune) Option
- func WithWindowsPath() Option
- type Options
- type ProcessFunc
- type ProcessRuneFunc
Constants ¶
This section is empty.
Variables ¶
var (
	ErrUnclosedQuote   = errors.New("unclosed quote")
	ErrDanglingEscape  = errors.New("dangling escape")
	ErrUnexpectedQuote = errors.New("unexpected unescaped quote inside token")
	ErrNotInTheCharset = errors.New("character is not in the charset")
)
var CharsetWindowsPath = NewCharset(windowsPathRunes(), true)
CharsetWindowsPath contains valid characters for Windows file paths. It excludes <>:"/\|?* and control characters (0-31).
var DefaultOptions = Options{
	Separators:       DefaultSeparators,
	AllowEmpty:       false,
	ProcessFunc:      nil,
	ProcessRuneFuncs: DefaultProcessRuneFuncs,
}
DefaultOptions contains the default parsing configuration.
var DefaultProcessRuneFuncs = []ProcessRuneFunc{
	NewReplaceEscapedFunc(DefaultReplaceEscaped),
}
var DefaultReplaceEscaped = map[rune]rune{
'n': '\n',
'r': '\r',
't': '\t',
'0': '\000',
}
var DefaultSeparators = []rune{' ', '\t', ','}
Functions ¶
func Parse ¶
func Parse(str string, opts ...Option) ([]string, error)
Parse splits an input string into elements using a set of separator runes, with support for quoting, escaping, configurable handling of empty elements, and optional character set validation.
Separators define element boundaries unless they appear inside quotes. Multiple different separator runes may be used at once.
Quoted elements may be enclosed in single (') or double (") quotes. Quotes are not included in the output. Quotes must start at the beginning of an element; encountering a quote after other characters results in an error.
Backslash escapes are processed both inside and outside quotes.
If AllowEmpty is false, consecutive separators are treated as a single separator and empty elements are discarded. If AllowEmpty is true, each separator produces a boundary and empty elements are preserved.
If a Charset is configured, all characters (after escape processing) are validated against the allowed character set. Invalid characters result in an InvalidCharError that includes the character and its position.
Errors are returned for dangling escape characters, unclosed quotes, unexpected quotes inside an element, or invalid characters.
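The rules above can be illustrated with a small, self-contained tokenizer. This is a simplified sketch, not the package's implementation: `splitSketch` and the lowercase error variables are hypothetical local names, and the sketch keeps an escaped rune literally instead of translating sequences such as \n.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local stand-ins for the package's exported sentinel errors.
var (
	errUnclosedQuote   = errors.New("unclosed quote")
	errDanglingEscape  = errors.New("dangling escape")
	errUnexpectedQuote = errors.New("unexpected unescaped quote inside token")
)

// splitSketch is a hypothetical, simplified tokenizer. seps lists the
// separator runes; allowEmpty keeps empty elements between separators.
func splitSketch(input, seps string, allowEmpty bool) ([]string, error) {
	var (
		out     []string
		cur     []rune
		quote   rune // active quote rune, 0 when outside quotes
		escaped bool // true when the previous rune was a backslash
	)
	flush := func() {
		if len(cur) > 0 || allowEmpty {
			out = append(out, string(cur))
		}
		cur = cur[:0]
	}
	for _, c := range input {
		switch {
		case escaped:
			cur = append(cur, c) // escaped rune is taken literally
			escaped = false
		case c == '\\':
			escaped = true
		case quote != 0 && c == quote:
			quote = 0 // closing quote is not part of the output
		case quote != 0:
			cur = append(cur, c) // separators inside quotes are kept
		case c == '\'' || c == '"':
			if len(cur) > 0 {
				return nil, errUnexpectedQuote
			}
			quote = c
		case strings.ContainsRune(seps, c):
			flush()
		default:
			cur = append(cur, c)
		}
	}
	if escaped {
		return nil, errDanglingEscape
	}
	if quote != 0 {
		return nil, errUnclosedQuote
	}
	flush()
	return out, nil
}

func main() {
	parts, err := splitSketch("foo bar 'baz qux'", " \t,", false)
	fmt.Printf("%q %v\n", parts, err) // ["foo" "bar" "baz qux"] <nil>

	parts, err = splitSketch("a,,b", ",", true)
	fmt.Printf("%q %v\n", parts, err) // ["a" "" "b"] <nil>
}
```

With allowEmpty set to false, the second call would yield only "a" and "b", because the empty element between the two commas is discarded.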
Types ¶
type Charset ¶
type Charset struct {
// contains filtered or unexported fields
}
Charset defines which characters are allowed in parsed elements.
func NewCharset ¶
NewCharset creates a charset from a set of allowed runes.
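As a rough illustration of how a rune set with optional case folding might work, here is a minimal sketch. The `charset` type, its fields, and the meaning of the boolean argument (treated here as "case-sensitive") are assumptions made for the example; the package's Charset fields are unexported and may differ.

```go
package main

import (
	"fmt"
	"unicode"
)

// charset is a hypothetical stand-in for the package's Charset type.
type charset struct {
	allowed       map[rune]struct{}
	caseSensitive bool
}

// newCharset builds the set; the second parameter's meaning is an assumption.
func newCharset(runes []rune, caseSensitive bool) *charset {
	cs := &charset{
		allowed:       make(map[rune]struct{}, len(runes)),
		caseSensitive: caseSensitive,
	}
	for _, r := range runes {
		cs.allowed[cs.fold(r)] = struct{}{}
	}
	return cs
}

// fold lowercases the rune when matching is case-insensitive.
func (cs *charset) fold(r rune) rune {
	if cs.caseSensitive {
		return r
	}
	return unicode.ToLower(r)
}

// contains reports whether r is allowed.
func (cs *charset) contains(r rune) bool {
	_, ok := cs.allowed[cs.fold(r)]
	return ok
}

func main() {
	insensitive := newCharset([]rune("abc"), false)
	sensitive := newCharset([]rune("abc"), true)
	fmt.Println(insensitive.contains('A'), sensitive.contains('A'), insensitive.contains('z'))
	// true false false
}
```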
type InvalidCharError ¶
InvalidCharError represents an error for an invalid character in the input.
func NewInvalidCharError ¶
func NewInvalidCharError(idx int, char rune, srcErr error) *InvalidCharError
NewInvalidCharError creates a new InvalidCharError.
func (*InvalidCharError) Error ¶
func (e *InvalidCharError) Error() string
func (*InvalidCharError) Unwrap ¶
func (e *InvalidCharError) Unwrap() error
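Because InvalidCharError exposes Unwrap, callers can presumably use the standard errors.Is and errors.As helpers to inspect it. The sketch below uses a local `invalidCharError` type whose field names (Idx, Char, Err) are assumptions derived from the constructor arguments; `validate` is a hypothetical helper, and the index it reports is a byte offset.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotInCharset stands in for the package's ErrNotInTheCharset.
var errNotInCharset = errors.New("character is not in the charset")

// invalidCharError mirrors the documented constructor arguments
// (index, rune, source error); the field names are assumptions.
type invalidCharError struct {
	Idx  int
	Char rune
	Err  error
}

func (e *invalidCharError) Error() string {
	return fmt.Sprintf("invalid character %q at position %d: %v", e.Char, e.Idx, e.Err)
}

// Unwrap exposes the wrapped sentinel so errors.Is can match it.
func (e *invalidCharError) Unwrap() error { return e.Err }

// validate is a hypothetical helper that reports the first disallowed rune.
func validate(s string, allowed func(rune) bool) error {
	for i, c := range s { // i is the byte offset of c
		if !allowed(c) {
			return &invalidCharError{Idx: i, Char: c, Err: errNotInCharset}
		}
	}
	return nil
}

func main() {
	isLower := func(c rune) bool { return c >= 'a' && c <= 'z' }
	err := validate("ab?c", isLower)

	fmt.Println(errors.Is(err, errNotInCharset)) // true

	var ice *invalidCharError
	if errors.As(err, &ice) {
		fmt.Println(ice.Idx, string(ice.Char)) // 2 ?
	}
}
```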
type Option ¶
type Option func(*Options)
Option is a function that modifies parsing options.
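The Option type follows the common Go functional-options pattern. A minimal sketch of that pattern, using local lowercase stand-ins (`options`, `option`, `withSeparators`, `withAllowEmpty`, `resolve`) rather than the package's own types, might look like this:

```go
package main

import "fmt"

// options is a local stand-in holding two of the documented Options fields.
type options struct {
	Separators []rune
	AllowEmpty bool
}

// option modifies an options value in place.
type option func(*options)

func withSeparators(seps ...rune) option {
	return func(o *options) { o.Separators = seps }
}

func withAllowEmpty(allow bool) option {
	return func(o *options) { o.AllowEmpty = allow }
}

// resolve starts from defaults, then applies each option in order,
// so later options override earlier ones.
func resolve(opts ...option) options {
	o := options{Separators: []rune{' ', '\t', ','}}
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

func main() {
	o := resolve(withSeparators(';'), withAllowEmpty(true))
	fmt.Println(string(o.Separators), o.AllowEmpty) // ; true
}
```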
func AppendProcessRuneFuncs ¶
func AppendProcessRuneFuncs(fns ...ProcessRuneFunc) Option
func WithAllowEmpty ¶
func WithAllowEmpty(allow bool) Option
WithAllowEmpty sets whether empty elements should be preserved.
func WithCharset ¶
func WithCharset(charset *Charset) Option
WithCharset sets the allowed character set for validation.
func WithProcessFunc ¶
func WithProcessFunc(fn ProcessFunc) Option
WithProcessFunc sets a function that processes each element before it is added to the output.
func WithProcessRuneFuncs ¶
func WithProcessRuneFuncs(fns ...ProcessRuneFunc) Option
func WithSeparators ¶
func WithSeparators(separators ...rune) Option
WithSeparators sets the separator runes.
func WithWindowsPath ¶
func WithWindowsPath() Option
WithWindowsPath makes the parser use the Windows path charset and validate the path with a regexp.
type Options ¶
type Options struct {
Separators []rune
AllowEmpty bool
ProcessFunc ProcessFunc
ProcessRuneFuncs []ProcessRuneFunc
}
Options configures the string parsing behavior.
type ProcessFunc ¶
ProcessFunc is called for each parsed element before it's added to the output. It receives the element string and returns:
- processed: the transformed string to use
- skip: if true, the element is not added to the output
- err: if non-nil, parsing stops and the error is returned
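To make the processed/skip/err contract concrete, here is a small sketch. The `processFunc` signature is inferred from the description above and is an assumption; `normalize` and `applyProcess` are hypothetical names showing one way a caller could honor the three return values.

```go
package main

import (
	"fmt"
	"strings"
)

// processFunc follows the documented contract; the exact signature is an assumption.
type processFunc func(element string) (processed string, skip bool, err error)

// normalize lowercases elements, skips those starting with '#', and rejects "..".
func normalize(element string) (string, bool, error) {
	switch {
	case strings.HasPrefix(element, "#"):
		return "", true, nil
	case element == "..":
		return "", false, fmt.Errorf("element %q is not allowed", element)
	}
	return strings.ToLower(element), false, nil
}

// applyProcess runs fn over each element, dropping skipped elements and
// stopping at the first error.
func applyProcess(elements []string, fn processFunc) ([]string, error) {
	var out []string
	for _, el := range elements {
		processed, skip, err := fn(el)
		if err != nil {
			return nil, err
		}
		if skip {
			continue
		}
		out = append(out, processed)
	}
	return out, nil
}

func main() {
	out, err := applyProcess([]string{"Foo", "#note", "BAR"}, normalize)
	fmt.Printf("%q %v\n", out, err) // ["foo" "bar"] <nil>
}
```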
type ProcessRuneFunc ¶
type ProcessRuneFunc func(idx int, element string, escaped bool, c rune) (processed rune, skip bool, err error)
ProcessRuneFunc is called for each parsed character before it's added to the element.
Receives:
- idx: index of the parsed rune
- element: the element before adding the parsed rune
- escaped: whether the character is escaped (has a backslash before it)
- c: the parsed rune
Returns:
- processed: the rune that will be added to the final element
- skip: if true, the element is not added to the output
- err: if non-nil, parsing stops and the error is returned (with the character and its index added)
func NewCharsetProcessRuneFunc ¶
func NewCharsetProcessRuneFunc(charset *Charset) ProcessRuneFunc
func NewReplaceEscapedFunc ¶
func NewReplaceEscapedFunc(replaceMap map[rune]rune) ProcessRuneFunc
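NewReplaceEscapedFunc is not described further here. One plausible reading, given DefaultReplaceEscaped, is that escaped runes found in the map are replaced and all other runes pass through unchanged. The sketch below implements that assumed behavior with local names (`processRuneFunc`, `newReplaceEscaped`); it is not the package's code.

```go
package main

import "fmt"

// processRuneFunc copies the documented ProcessRuneFunc signature.
type processRuneFunc func(idx int, element string, escaped bool, c rune) (processed rune, skip bool, err error)

// newReplaceEscaped is an assumed reading of NewReplaceEscapedFunc:
// an escaped rune present in the map is replaced, anything else is kept.
func newReplaceEscaped(replace map[rune]rune) processRuneFunc {
	return func(_ int, _ string, escaped bool, c rune) (rune, bool, error) {
		if escaped {
			if r, ok := replace[c]; ok {
				return r, false, nil
			}
		}
		return c, false, nil
	}
}

func main() {
	fn := newReplaceEscaped(map[rune]rune{'n': '\n', 't': '\t'})

	r, _, _ := fn(0, "", true, 'n') // escaped 'n' maps to a newline
	fmt.Printf("%q\n", r)           // '\n'

	r, _, _ = fn(1, "", false, 'n') // unescaped 'n' passes through
	fmt.Printf("%q\n", r)           // 'n'
}
```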