Text Parser
UPDATE: I made changes to the code with great suggestions and code examples from Pete. The routine is much more stable now. Thanks Pete!
By the way - writing text parsing routines is much harder than it looks. So much to consider in their design.

While writing the library for lesson 20 of the tutorial I wrote a text parser that's fairly efficient. I needed a parsing routine that could report the number of lines the input text string would be parsed into and deliver one line of text at a time on demand. This is what I came up with:

Code:
FUNCTION ParseText$ (TextIn AS STRING, MaxWidth AS INTEGER, Action AS INTEGER) STATIC

    '-> Modifications added suggested by Pete 10/03/22. Corrects issue of crashing and mishandling of text
    '   in certain situations. (additions remarked in code below)
    '   - Function exit if text sent in is null
    '   - Handles non-breaking strings larger than space allocated
    '   - Handles end of text that has no trailing space
    '   - Clear text for next parsing event

    '-> Parses the string passed in into multiple lines of the maximum width desired
    '   The first time the function is called, regardless of action, the TextIn string is fully parsed
    '   Subsequent calls to the function with the same text will not parse the TextIn string again

    '-> INPUT PARAMETERS:
    '   TextIn   - the text string sent into the function
    '   MaxWidth - maximum width of text on a line
    '   Action   - 1 reports number of lines created, 0 returns lines of parsed text ("" = finished)

    '-> EXAMPLE:
    '   t$ = "The rain in Spain falls mainly on the plain. The weather in Spain seems pretty good!"
    '   Lines = VAL(ParseText$(t$, 40, 1)) '          report number of lines the text was parsed into
    '   DO '                                          get all parsed lines of text
    '       TextLine$ = ParseText$(t$, 40, 0) '       return the next line of parsed text
    '       IF TextLine$ <> "" THEN PRINT TextLine$ ' could easily be saved to an array as well
    '   LOOP UNTIL TextLine$ = "" '                   ParseText$ returns null when all lines returned

    '-> There is no need to have ParseText$ report the number of lines needed. A simple counter could be
    '   placed within the DO...LOOP instead. The number of lines the text was parsed into is returned
    '   as a string and needs to be converted to a number with VAL(), as seen in the 2nd line of the example.

    DIM PText AS STRING '    previous text that was sent in
    DIM Index AS INTEGER '   array index counter
    DIM Plen AS INTEGER '    parse string length
    DIM Char AS STRING * 1 ' character analyzer
    DIM Parse AS STRING '    parsed string
    DIM WText AS STRING '    working text string
    DIM Done AS INTEGER '    flag to indicate parsing finished

    IF TextIn = "" OR MaxWidth < 1 THEN ParseText$ = "": EXIT FUNCTION ' (Pete) leave if null text or invalid width sent in
    IF PText <> TextIn THEN '                                            was a new text string sent in?
        PText = TextIn '                                                 yes, remember text that was sent in
        WText = TextIn '                                                 get text sent in to work with
        Index = 0 '                                                      reset index counter
        Done = 0 '                                                       reset finished flag
        REDIM Text(0) AS STRING '                                        reset text array
    END IF
    IF NOT Done THEN '                                                   has parsing already been performed?
        DO '                                                             no, begin array loop
            Index = Index + 1 '                                          increment index counter
            REDIM _PRESERVE Text(Index) AS STRING '                      increase size of array

            ' (Pete) Non-breaking string larger than space allotted checked below.

            IF LEN(WText) > MaxWidth AND INSTR(MID$(WText, 1, MaxWidth + 1), " ") = 0 THEN ' (Pete)
                Plen = MaxWidth '                                 (Pete)  set length to maximum size
                Parse = MID$(WText, 1, Plen) '                    (Pete)  get the maximum size string allowed
            ELSE
                IF MID$(WText, MaxWidth + 1, 1) <> " " THEN '     (Pete) text with no trailing space?
                    Plen = MaxWidth '                             (Pete) yes, set length to maximum width
                ELSE '                                                   no, there is a trailing space
                    Plen = MaxWidth + 1 '       (+1 for trailing spaces) set length to include space
                END IF
                DO '                                                     begin parse loop
                    IF LEN(WText) <= Plen THEN '                         remaining text all that is left?
                        Parse = MID$(WText, 1, Plen) '                   yes, get remaining text
                        Done = -1 '                                      parsing is done
                    ELSE '                                               no, text still longer than max width
                        IF INSTR(MID$(WText, 1, Plen), " ") = 0 THEN '   space found in text? (Pete)
                            Plen = Plen + 1 '                     (Pete) no, increment length
                        ELSE
                            DO '                                         begin space search loop
                                Char = MID$(WText, Plen, 1) '            get last character
                                IF Char <> " " THEN Plen = Plen - 1 '    if not a space then move back one
                            LOOP UNTIL Char = " " '                      leave when space found
                            Parse = LEFT$(WText, Plen - 1) '             get parsed string without space at end
                        END IF
                    END IF
                LOOP UNTIL Char = " " OR Done '                          leave when space found or parsing done
            END IF
            Text(Index) = Parse '                                        save the parsed text
            IF NOT Done THEN WText = MID$(WText, Plen + 1, LEN(WText)) ' remove parsed text from string
        LOOP UNTIL Done '                                                leave when parsing done
        Index = 0 '                                                      reset index counter for reporting
    END IF
    IF Action = 1 THEN '                                                 report number of lines?
        ParseText$ = STR$(UBOUND(Text)) '                                yes, return number of lines as a string
    ELSE '                                                               no, report parsed lines found
        Index = Index + 1 '                                              increment index counter
        IF Index > UBOUND(Text) THEN '                                   have all lines been reported?
            ParseText$ = "" '                                            yes, report nothing remaining
            PText = "" '                                          (Pete) clear for next parsing event
        ELSE '                                                           no, parsed text remains
            ParseText$ = Text(Index) '                                   report next line of text
        END IF
    END IF

END FUNCTION

Drop the function into your own code and parse away.
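The comments in the function mention that the parsed lines could be saved to an array. Here is a minimal sketch of that idea, assuming the ParseText$ function above is pasted below this code. The names Wrapped(), LineCount, and Dummy are only illustrative and are not part of the routine.

```basic
' Usage sketch: collect every parsed line into an array.
' Assumes the ParseText$ function from this post is pasted below this code.
' Wrapped(), LineCount and Dummy are illustrative names, not part of the original routine.

DIM t AS STRING '         text to parse
DIM LineCount AS INTEGER ' number of lines reported by ParseText$
DIM i AS INTEGER '        loop counter
DIM Dummy AS STRING '     receives the final "" that ParseText$ returns

t = "The rain in Spain falls mainly on the plain. The weather in Spain seems pretty good!"

LineCount = VAL(ParseText$(t, 40, 1)) '       ask how many lines the text parses into
IF LineCount > 0 THEN
    REDIM Wrapped(1 TO LineCount) AS STRING ' size the array to fit
    FOR i = 1 TO LineCount
        Wrapped(i) = ParseText$(t, 40, 0) '   fetch the next parsed line
    NEXT i
    Dummy = ParseText$(t, 40, 0) '            one extra call returns "" and clears the remembered text
    FOR i = 1 TO LineCount
        PRINT Wrapped(i)
    NEXT i
END IF
```

The extra call after the first loop is there because the function only clears its remembered text once it has returned "". Without it, the next request for the same text would get that "" first. Driving the loop by the reported line count, rather than stopping at the first "", also looks safer for input that starts with a space or contains doubled spaces, since from reading the code such input can produce an empty line in the middle of the results.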
Text Parser - by TerryRitchie - 10-03-2022, 12:28 AM