(05-25-2023, 11:04 PM)Kernelpanic Wrote: I know the "StringTokenizer" class from Java. Recreating it might not be easy. It would probably make more sense to call a corresponding Java program from QB64 and pass it the text, just as you can with C.
In Java:
Code:
/* StringTokenizer example - May 26, 2023 */
import java.util.*;

public class BeispielToken
{
    public static void main(String[] args)
    {
        String s = "Dies ist nur ein Test"; // German for "This is just a test"
        StringTokenizer st = new StringTokenizer(s);
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}
The Java StringTokenizer is exactly what this design is based on. After looking at RhoSigma's code, I took some inspiration and got carried away. lol.
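For anyone who has only used the one-argument form shown in the quote: the JDK class also has a constructor `StringTokenizer(String str, String delim, boolean returnDelims)`, which is where the `delims` and `returnDelims` parameters come from. A minimal sketch of it follows; the class name `DelimTokenDemo` and the helper `tokenize` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimTokenDemo {
    // Hypothetical helper: collects every token the JDK tokenizer produces.
    static List<String> tokenize(String text, String delims, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(text, delims, returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Delimiters are dropped
        System.out.println(tokenize("a=b,c", "=,", false)); // [a, b, c]
        // Each delimiter comes back as its own one-character token
        System.out.println(tokenize("a=b,c", "=,", true));  // [a, =, b, ,, c]
    }
}
```

Note that the Java class has no notion of quote characters, so that part is an extension on the QB64 side.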
Code:
$CONSOLE:ONLY
OPTION _EXPLICIT

REDIM mytokens(-2 TO -2) AS STRING
DIM s AS STRING: s = "Function MyFunc(MyStr As String, Optional MyArg1 As Integer = 5, Optional MyArg2 = 'Dolores Abernathy')"
DIM n AS LONG: n = TokenizeString(s, "(),= ", 0, "''", mytokens())
PRINT n; " tokens parsed"

DIM i AS LONG
FOR i = LBOUND(mytokens) TO UBOUND(mytokens)
    PRINT i; "="; mytokens(i)
    SLEEP 1
NEXT

END

' Tokenizes a string into a dynamic string array
' text - the input string
' delims - a list of delimiter characters (more than one can be specified)
' returnDelims - if True, the delimiters are also returned, in their correct positions in the tokens array
' quoteChars - a string holding the opening and closing "quote" characters; should be exactly 2 chars
' tokens() - the dynamic array that will hold the tokens
' Returns: the number of tokens parsed (returned delimiters are included in the count)
FUNCTION TokenizeString& (text AS STRING, delims AS STRING, returnDelims AS _BYTE, quoteChars AS STRING, tokens() AS STRING)
    DIM strLen AS LONG: strLen = LEN(text)

    IF strLen = 0 THEN EXIT FUNCTION ' nothing to be done

    DIM arrIdx AS LONG: arrIdx = LBOUND(tokens) ' we'll always start from the array lower bound - whatever it is
    DIM insideQuote AS _BYTE ' flag to track if currently inside a quote

    DIM token AS STRING ' holds a token until it is ready to be added to the array
    DIM char AS STRING * 1 ' this is a single char from text we are iterating through
    DIM AS LONG i, count

    ' Iterate through the characters in the text string
    FOR i = 1 TO strLen
        char = CHR$(ASC(text, i))
        IF insideQuote THEN
            IF char = RIGHT$(quoteChars, 1) THEN
                ' Closing quote char encountered, resume delimiting
                insideQuote = 0
                GOSUB add_token ' add the token to the array
                IF returnDelims THEN GOSUB add_delim ' add the closing quote char as delimiter if required
            ELSE
                token = token + char ' add the character to the current token
            END IF
        ELSE
            IF char = LEFT$(quoteChars, 1) THEN
                ' Opening quote char encountered, temporarily stop delimiting
                insideQuote = -1
                GOSUB add_token ' add the token to the array
                IF returnDelims THEN GOSUB add_delim ' add the opening quote char as delimiter if required
            ELSEIF INSTR(delims, char) = 0 THEN
                token = token + char ' add the character to the current token
            ELSE
                GOSUB add_token ' found a delimiter, add the token to the array
                IF returnDelims THEN GOSUB add_delim ' found a delimiter, add it to the array if required
            END IF
        END IF
    NEXT

    GOSUB add_token ' add the final token if there is any

    IF count > 0 THEN REDIM _PRESERVE tokens(LBOUND(tokens) TO arrIdx - 1) AS STRING ' resize the array to the exact size

    TokenizeString = count

    EXIT FUNCTION

    ' Add the token to the array if there is any
    add_token:
    IF LEN(token) > 0 THEN
        tokens(arrIdx) = token ' add the token to the token array
        token = "" ' clear the current token
        GOSUB increment_counters_and_resize_array
    END IF
    RETURN

    ' Add delimiter to array if required
    add_delim:
    tokens(arrIdx) = char ' add delimiter to array
    GOSUB increment_counters_and_resize_array
    RETURN

    ' Increment the count and array index and resize the array if needed
    increment_counters_and_resize_array:
    count = count + 1 ' increment the token count
    arrIdx = arrIdx + 1 ' move to next position
    IF arrIdx > UBOUND(tokens) THEN REDIM _PRESERVE tokens(LBOUND(tokens) TO UBOUND(tokens) + 512) AS STRING ' resize in 512 chunks
    RETURN
END FUNCTION

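For those coming from the Java side, here is a rough sketch of the same character loop in Java, mainly to show how the quote handling sits on top of the usual delimiter logic. This is not part of the QB64 code: the class name `QuoteAwareTokenizer` and the helper `flush` are invented for illustration, it returns a `List` instead of filling an array, and it assumes an empty `quoteChars` simply turns quote handling off.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareTokenizer {
    // Hypothetical Java port of the TokenizeString loop (illustration only).
    static List<String> tokenize(String text, String delims, boolean returnDelims, String quoteChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder token = new StringBuilder();
        boolean insideQuote = false;
        boolean hasQuotes = !quoteChars.isEmpty(); // assumption: "" disables quoting
        char open = hasQuotes ? quoteChars.charAt(0) : '\0';
        char close = hasQuotes ? quoteChars.charAt(quoteChars.length() - 1) : '\0';

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (insideQuote) {
                if (c == close) {
                    // Closing quote: emit the quoted text as one token
                    insideQuote = false;
                    flush(token, tokens);
                    if (returnDelims) tokens.add(String.valueOf(c));
                } else {
                    token.append(c); // delimiters are ignored inside quotes
                }
            } else if (hasQuotes && c == open) {
                // Opening quote: finish any pending token, stop delimiting
                insideQuote = true;
                flush(token, tokens);
                if (returnDelims) tokens.add(String.valueOf(c));
            } else if (delims.indexOf(c) < 0) {
                token.append(c); // ordinary character
            } else {
                flush(token, tokens); // delimiter ends the current token
                if (returnDelims) tokens.add(String.valueOf(c));
            }
        }
        flush(token, tokens); // final token, if any
        return tokens;
    }

    // Adds the pending token to the list if it is non-empty, then clears it.
    private static void flush(StringBuilder token, List<String> tokens) {
        if (token.length() > 0) {
            tokens.add(token.toString());
            token.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a = 'b c'", "= ", false, "''")); // [a, b c]
    }
}
```

With the test string from the demo above and the same arguments, this sketch yields 13 tokens, the last one being `Dolores Abernathy` kept as a single token because of the quotes.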
I'll update the main post.