Storing stuff
#1
In the past, whenever I needed to store something to a file, I always used a user-defined type and wrote and read the contents using INPUT and OUTPUT. Now that I am starting to design the actual gradebook, I am a little stumped. Naturally, an array sounds useful for something like this. My initial thought is to assign every student a unique ID (UID), unknown to the user but used by me, so if a name changes or something of the sort, we can still line the grades up with the proper name.

So, the thinking is UID, grade or mark, and notes. I am not overly experienced with multi-dimensional arrays. Are there any issues resizing them using REDIM _PRESERVE? Is there a better idea for keeping and using this information, or will storing it with INPUT/OUTPUT not be overly taxing, even though I would have to overwrite the file every time I make an update? Using RANDOM sounds like I have to make all these decisions on the front end... I guess I am just looking for a nudge in the right direction, since I am kinda inexperienced with this type of programming.

Thanks y'all
#2
I developed a member registration system for clubs some years ago.
I really prefer files in CSV format for this kind of thing.
Just load them all into memory (typed arrays) at the start of the program, and save them when closing (or when choosing 'save' from the menu).
That's easier, and searching etc. is very fast.
Every member and every address has a unique internal ID that never gets reused. Cancelled membership is just a property (end-date) of a member. Can always be used in the future for lustrums or alumni events...
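
A minimal sketch of that load-everything, save-on-exit pattern, assuming a one-record-per-line CSV written with WRITE # and read back with INPUT #. The Member type, the members() array, the three SUBs and the file name "members.csv" are all made up for illustration; they are not from the registration system described above.

```basic
' Sketch only: Member, members(), LoadMembers, AddMember, SaveMembers and
' "members.csv" are illustrative names, not taken from the post above.
TYPE Member
    ID AS LONG
    FullName AS STRING * 30
    EndDate AS STRING * 10 ' blank = still active; cancelled members are kept, not deleted
END TYPE

REDIM SHARED members(0) AS Member ' dynamic array; element 0 is unused
DIM SHARED memberCount AS LONG, nextID AS LONG

LoadMembers "members.csv"
AddMember "New Person"
SaveMembers "members.csv"
PRINT memberCount; "member(s) on file; next free ID is"; nextID
END

SUB LoadMembers (file$)
    memberCount = 0: nextID = 1
    IF NOT _FILEEXISTS(file$) THEN EXIT SUB
    OPEN file$ FOR INPUT AS #1
    DO UNTIL EOF(1)
        INPUT #1, id&, n$, e$ ' one CSV line = one member
        memberCount = memberCount + 1
        REDIM _PRESERVE members(memberCount) AS Member
        members(memberCount).ID = id&
        members(memberCount).FullName = n$
        members(memberCount).EndDate = e$
        ' Records are never deleted, so highest ID + 1 has never been used before.
        IF id& >= nextID THEN nextID = id& + 1
    LOOP
    CLOSE #1
END SUB

SUB AddMember (n$)
    memberCount = memberCount + 1
    REDIM _PRESERVE members(memberCount) AS Member
    members(memberCount).ID = nextID
    members(memberCount).FullName = n$
    members(memberCount).EndDate = ""
    nextID = nextID + 1
END SUB

SUB SaveMembers (file$)
    OPEN file$ FOR OUTPUT AS #1 ' rewrites the whole file
    FOR i = 1 TO memberCount
        WRITE #1, members(i).ID, RTRIM$(members(i).FullName), RTRIM$(members(i).EndDate)
    NEXT
    CLOSE #1
END SUB
```

The array here is one-dimensional with a UDT per element, which keeps REDIM _PRESERVE simple. The "never reused" guarantee in this sketch depends on cancelled members staying in the file.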
45y and 2M lines of MBASIC>BASICA>QBASIC>QBX>QB64 experience
#3
Code: (Select All)
'How to write my data
'StudentID:###...
'FirstName:$$$...
'LastName: $$$...
'START PERIOD: #
'START GRADES
'Grade:    ###
'Note :    $$$...
'Grade:    ###
'Note :    $$$...
'Grade:    ###
'Note :    $$$...
'END GRADES
'Average: ###
'Note  : $$$...
'END PERIOD #
'Semester 1 Exam: ###
'Note  : $$$...
'Semester 1 Avrg: ###
'Note  : $$$...
'Semester 2 Exam: ###
'Note  : $$$...
'Semester 2 Avrg: ###
'Note  : $$$...
'Final Grade  :  ###
'Note  : $$$...
'Note  : $$$...
'Note  : $$$...
'Note  : $$$...
'End StudentID:  ###...


This is a simple enough, and human enough, structure that you should be able to open up the data file in any text editor, and decipher the results.  (In case some new system security feature or school rule makes your program not work in the future.)

The way I'd do this is to open my file FOR OUTPUT (Yes, that means it'd be overwritten each time I updated my records, but we're not talking 200GB databases here...  Small data sets, small problems with rewriting them over and over!).  Then I'd just PRINT my data on each line.

Code: (Select All)
PRINT #1, "StudentID: "; StudentID
PRINT #1, "First Name: "; fname$
PRINT #1, "Last Name: "; lname$
FOR I = 1 TO NumberOfPeriods '1st Six Weeks is 1, 2nd Six Weeks is 2, 3rd Six Weeks is 3...
    PRINT #1, "START PERIOD: "; I
    PRINT #1, "START GRADES"
    FOR J = 1 TO NumberOfGrades 'number in this period
       PRINT #1, "Grade: "; Grade(J)
       PRINT #1, "Notes: "; Notes(J)
    NEXT
    PRINT #1, "END GRADES"
    PRINT #1, "Average: "; Average(I)
    PRINT #1, "Notes: "; PeriodNotes(I)
    PRINT #1, "END PERIOD: "; I
NEXT
'...and then print the final results and semester exam/averages

Writing it in such a manner, I can then read it back with DO-LOOPS, so my data set can be as irregular as needed.  I don't need to plan for 10 tests in the first 6 weeks.  When it comes to input, the main process would be:

Code: (Select All)
DO
    LINE INPUT #1, temp$
    IF temp$ = "END GRADES" THEN EXIT DO
    count = count + 1 'only count the line once we know it is a grade, not the end marker
    Grade(count) = VAL(MID$(temp$, INSTR(temp$, ":") + 1))
    LINE INPUT #1, temp$
    Note(count) = MID$(temp$, INSTR(temp$, ":") + 1)
LOOP

Here, I read my data and assign it to my array -- and I'm reading it until I come across that END OF DATA type marker -- in this case, "END GRADES" and "END PERIOD".  When I read the data file and see "START GRADES", I know I'm going to have a DO..LOOP where I read my data as GRADE, NOTE, each on a separate line.  I continue to read GRADE, NOTE, until I finally come to that EOD marker "END GRADES", and then I can read what comes next in my data file.   (In this case, the period average.)

StartOfData
grade
note
grade
note
grade
note
repeat as many times as you like with the grade then note...
EndOfData

^ This type of structure doesn't have a hardcoded limit to it.  It could contain 1 record, or it could contain 3000.  What we know about it is that the data is always going to be in the format of GRADE on one line, followed by NOTE on the next line.  We just read our data file line by line, checking for our EOD marker, and until we see it, we keep adding to our list.

It's a very simple type of structure, but very flexible, and it molds itself quite well for use with variable length strings.  I really think it'd be the way I'd go about storing the data you're talking about to my drive.
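
For reference, a small self-contained round trip of the START GRADES / END GRADES idea, offered as a sketch under assumptions: the file name "grades_demo.txt" and the sample values are made up, and the arrays are grown with REDIM _PRESERVE so no maximum has to be chosen up front. It assumes a well-formed file; a missing END GRADES line would run the inner loop past the end of the file.

```basic
' Sketch only: the file name and the sample values are made up.
REDIM Grade(0) AS INTEGER
REDIM Note(0) AS STRING

' Write one small block in the labelled format.
OPEN "grades_demo.txt" FOR OUTPUT AS #1
PRINT #1, "START GRADES"
PRINT #1, "Grade: 95"
PRINT #1, "Note : on time"
PRINT #1, "Grade: 70"
PRINT #1, "Note : late"
PRINT #1, "END GRADES"
PRINT #1, "Average: 82.5"
CLOSE #1

' Read it back without knowing how many grades there are.
OPEN "grades_demo.txt" FOR INPUT AS #1
DO UNTIL EOF(1)
    LINE INPUT #1, temp$
    IF temp$ = "START GRADES" THEN
        count = 0
        DO
            LINE INPUT #1, temp$
            IF temp$ = "END GRADES" THEN EXIT DO
            count = count + 1
            REDIM _PRESERVE Grade(count) AS INTEGER ' grow by one, keep old values
            REDIM _PRESERVE Note(count) AS STRING
            Grade(count) = VAL(MID$(temp$, INSTR(temp$, ":") + 1))
            LINE INPUT #1, temp$
            Note(count) = LTRIM$(MID$(temp$, INSTR(temp$, ":") + 1))
        LOOP
    ELSEIF LEFT$(temp$, 8) = "Average:" THEN
        average = VAL(MID$(temp$, 9))
    END IF
LOOP
CLOSE #1

FOR i = 1 TO count
    PRINT Grade(i), Note(i)
NEXT
PRINT "Average:"; average
```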
#4
(12-14-2022, 02:17 PM)SMcNeill Wrote:

There are some good ideas here, and they helped me think about how I can be more flexible. When I said plan, I didn't mean for the teacher but for me. I have already been scratching on paper on and off during my break time, and I decided I want a flag as well, for missing or late turn-in, so I can do some color coding for easy reading. I just had to convert the data file for my real students' reports to add that UID for each one, and that was a bit of a hassle. I guess I am trying to avoid that again.

I guess I sometimes get too caught up in being efficient with reads and writes. You are right. I'd be surprised if any one of my files clears a MB or two. Thanks for helping out! I can't wait to share the next update with you all. It is coming along nicely!
#5
Obligatory SQLite3 / MySQL plug
Ask me about Windows API and maybe some Linux stuff
#6
Quote:Thanks y'all

You must be from Southern China!

Random access files are nice if you are going to be overwriting lots of entries; otherwise, sequential files are the way I usually roll. (Another funny expression.) Anyway, if you go with sequential, use OPEN FOR BINARY instead of FOR INPUT. Why? Because it's about 10 times faster. With the extra speed, you really don't even have to build hash tables or use SEEK methods to get to dated entries, etc. In other words, for fewer than millions of records, you don't need to design a more complicated file structure.
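
For comparison, this is roughly what the random access route looks like: a sketch assuming a fixed-size record, where GradeRec, its field sizes and the file name "grades.rnd" are made up for illustration. The record length has to be fixed up front, which is the front-end decision the original question was worried about.

```basic
' Sketch only: GradeRec, its field sizes and "grades.rnd" are made up.
TYPE GradeRec
    UID AS LONG
    Mark AS INTEGER
    Flags AS _UNSIGNED _BYTE
    Note AS STRING * 40 ' fixed length: every record has the same size on disk
END TYPE

DIM rec AS GradeRec
OPEN "grades.rnd" FOR RANDOM AS #1 LEN = LEN(rec)

rec.UID = 1: rec.Mark = 88: rec.Flags = 0: rec.Note = "first draft"
PUT #1, 3, rec ' write record number 3 directly

rec.Mark = 92: rec.Note = "after resubmission"
PUT #1, 3, rec ' overwrite only that record; the rest of the file is untouched

GET #1, 3, rec ' read it back by record number
PRINT rec.UID; rec.Mark; RTRIM$(rec.Note)
PRINT "Records in file:"; LOF(1) \ LEN(rec)
CLOSE #1
```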

In school, I got straight A's, except in math, where they were rounded.

Pete
If eggs are brain food, Biden takes his scrambled.
#7
@NasaCow
my two cents:


https://www.youtube.com/watch?v=AkCsREkq...2&index=16



Will you manage the database by hand, or use a DLL, for example SQL3?
A sequential text/binary file? Or a UDT random file?
One file or several? A hash table, or pointers to the next/previous block of data (see nodes with one or two pointers)?


Well, how much time do you want to spend designing and debugging a database?
QB64 gives you all the opportunities to build up any kind of database!
Good luck, but above all plan well for the amount of work your database must support, and think about how to restore a corrupt file too.
Looking forward to your developments...
#8
(12-14-2022, 08:04 PM)Pete Wrote:

My mother's side is from South Carolina; that's about as southern as I get. Beijing is in the bitter cold north (we don't typically get snow, so it is a win/win). Is OPENing FOR BINARY used the same way as INPUT and OUTPUT? I have only ever worked with UDTs using INPUT and OUTPUT. Ideally, I'd like to record changes as they happen, so there is no need to save. The wiki isn't friendly on how to use BINARY as a file type. Perhaps another magical piece of demonstration code can help me get my head around it... I am using the release-over-the-choice method for mouse selection. The code you shared was amazing and helpful! Thanks
(12-14-2022, 11:39 PM)TempodiBasic Wrote:

That's the thing: I have a lot of choices, but I am not really informed enough to make one. I have lived and died by UDT input and output files (not even random; I just load the whole thing at once and write it all again if there is a change). Right now, VPNs and China are not playing nice. When Express gets their service working again, I will check out the video for sure. I am guessing that altogether I need somewhere between 20 and 40 students, up to 600 assignments (that's three per school day; I don't think any teacher would exceed that in a school year...), a comment available for each, and flags (I think a single byte could do this, up to 8 different flags). UDT seems clunky to me... What are your thoughts? I do appreciate the feedback.
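
On the single-byte idea: a sketch of how eight on/off flags can share one _UNSIGNED _BYTE using OR to set, AND NOT to clear, and AND to test. The flag names and their meanings are assumptions made up for the example.

```basic
' Sketch only: the flag names and values are made up for illustration.
CONST FLAG_MISSING = 1 ' bit 0
CONST FLAG_LATE = 2 ' bit 1
CONST FLAG_EXCUSED = 4 ' bit 2 (bits 3-7 are still free: 8, 16, 32, 64, 128)

DIM flags AS _UNSIGNED _BYTE

flags = flags OR FLAG_LATE ' set a flag
flags = flags OR FLAG_EXCUSED ' set another one
flags = flags AND NOT FLAG_EXCUSED ' clear a flag, leaving the others alone

IF flags AND FLAG_LATE THEN COLOR 14 ELSE COLOR 7 ' yellow for late work
PRINT "Assignment 1"
COLOR 7
IF (flags AND FLAG_MISSING) = 0 THEN PRINT "Not missing"
```

Because the byte is just a number from 0 to 255, it can be saved as one more value on a line of a text file, or as a one-byte field in a UDT.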
#9
QB64 lets you OPEN a sequential file FOR BINARY and still read it with LINE INPUT. Using it speeds up the process of LINE INPUT. So instead of...

OPEN "MYFILE.DAT" FOR INPUT AS #1
    DO
        LINE INPUT #1, a$
    LOOP UNTIL EOF(1)
CLOSE #1

we do...

OPEN "MYFILE.DAT" FOR BINARY AS #1
    DO
        LINE INPUT #1, a$
    LOOP UNTIL EOF(1)
CLOSE #1


Now a couple of things to note here:

1) ...FOR INPUT won't create a non-existing file, but FOR BINARY will.
2) An empty ...FOR INPUT file will just exit the loop. An empty ...FOR BINARY will error out.

So here is how to bulletproof the ...FOR BINARY method:

Code: (Select All)
IF _FILEEXISTS("MYFILE.DAT") THEN
    OPEN "MYFILE.DAT" FOR BINARY AS #1
    IF LOF(1) THEN ' Checks whether there are any bytes of data in the file. Zero means it is empty (false), so the block gets bypassed.
        DO
            LINE INPUT #1, a$
        LOOP UNTIL EOF(1) ' End of file
    END IF
    CLOSE #1
END IF

I've made use of sequential files that can overwrite data, so I'm just as happy using them as random access, and I don't have to be concerned about record numbers or the length of the records. Overwriting is a bit more involved in sequential files, as it requires writing to a temp file. Basically, write all data up to the block you are changing to the temp file, then write the changed data, then write the remaining data. Now just kill the original database file and rename the temp file to the database file's name.
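
The temp-file steps can be sketched like this, assuming records that each end with an [EOR] line. The file names, the record number being replaced and the replacement lines are all made up for the example.

```basic
' Sketch only: file names, the target record number and the new record are assumptions.
db$ = "MYFILE.DAT": tmp$ = "MYFILE.TMP"
target = 2 ' replace the 2nd record (each record ends with an [EOR] line)

IF _FILEEXISTS(db$) THEN
    OPEN db$ FOR BINARY AS #1
    OPEN tmp$ FOR OUTPUT AS #2
    rec = 1 ' number of the record the current line belongs to
    IF LOF(1) THEN
        DO
            LINE INPUT #1, a$
            IF rec = target THEN
                ' Lines of the old record are skipped, not copied.
                IF a$ = "[EOR]" THEN
                    ' End of the old record reached: write its replacement instead.
                    PRINT #2, "[name] Clippy"
                    PRINT #2, "[grade] A"
                    PRINT #2, "[EOR]"
                END IF
            ELSE
                PRINT #2, a$ ' everything before and after the block is copied unchanged
            END IF
            IF a$ = "[EOR]" THEN rec = rec + 1
        LOOP UNTIL EOF(1)
    END IF
    CLOSE #1, #2
    KILL db$ ' remove the original...
    NAME tmp$ AS db$ ' ...and put the temp file in its place
END IF
```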


I've worked with sequential files using "EOR" as my made-up end-of-record marker. The structure is pretty much what Steve described. It's nice in that I could put whatever I wanted into each labeled record. In other words, no pattern is required; it's very flexible, so I don't need a fixed structure. All I did to find what I wanted in the file was add an identifier to each line of data, then parse it out while line inputting it...

[name] Clippy
[date] 01-01-2000
[assignment] Math 101
[grade] F
[comments] Clippy is an idiot.
[EOR]

Now it doesn't even matter if I write the next record in this order; I can search for what I need with [date], [assignment], etc. and pop the data being read into an array. mydata$(1) = MID$(a$, INSTR(a$, "[assignment]") + 13) would store "Math 101" (the + 13 skips the 12-character tag and the space after it) for me to view once everything else I need from this file has been pulled.
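
To avoid counting tag lengths by hand, the tag and the value can be split at the closing bracket. A small sketch; TagOf$ and ValueOf$ are hypothetical helper names, not part of QB64.

```basic
' Sketch only: TagOf$ and ValueOf$ are hypothetical helper names.
a$ = "[assignment] Math 101"
PRINT TagOf$(a$) ' prints: assignment
PRINT ValueOf$(a$) ' prints: Math 101

FUNCTION TagOf$ (text$)
    ' Text between the leading "[" and the first "]"; empty if the line has no tag.
    p = INSTR(text$, "]")
    IF LEFT$(text$, 1) = "[" AND p > 2 THEN TagOf$ = MID$(text$, 2, p - 2)
END FUNCTION

FUNCTION ValueOf$ (text$)
    ' Everything after the first "]", with leading spaces removed.
    p = INSTR(text$, "]")
    IF p THEN ValueOf$ = LTRIM$(MID$(text$, p + 1))
END FUNCTION
```

Inside the LINE INPUT loop, a SELECT CASE on TagOf$(a$) could then route each value to the right variable, and an "EOR" tag would mark the point where the record is complete.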

A second, and more organized, method requires that you know the number of elements per record. In the above example, we have 5: name, date, assignment, grade, comments. If that just keeps getting repeated, you don't need any labeling to identify the entries. You identify them by the numbers 1 - 5. Of course, if we had 19 other students, this one assignment would be 100 entries. This is where MOD comes in...

Code: (Select All)
IF _FILEEXISTS("MYFILE.DAT") THEN
    OPEN "MYFILE.DAT" FOR BINARY AS #1
    IF LOF(1) THEN
        DO
            i = i + 1 'Counter
            LINE INPUT #1, a$
            j = i MOD 5: IF j = 0 THEN j = 5 ' Field number 1-5; a remainder of 0 means the 5th (last) field of a record.
            IF j = 3 THEN PRINT j, a$ ' Let's just print the assignment.
        LOOP UNTIL EOF(1) ' End of file
    END IF
    CLOSE #1
END IF

If this is per class, you might be dealing with 600 assignments * 20 students * 5 entries = 60,000 lines of data. QB64 ...FOR BINARY can read a file of that size fast enough without the complexity of hashing or indexing. I have many files of around 40,000 lines that process just fine, but if you were to make a database for every student in the whole school for every school year for the next 20 years, you'd need more of the MySQL method.

Anyway. My advice would be to figure out what database format you want; a fixed block like the example above, or one with flexibility to add stuff as you go like the other example. Once you're settled on something, write a very small test database and post it if you have any questions.

Good luck,

Pete
#10
(12-15-2022, 11:51 AM)Pete Wrote: If this is per class, you might be dealing with 600 assignments * 20 students * 5 entries = 60,000 lines of data. QB64 ...FOR BINARY can read a file of that size fast enough without the complexity of hashing or indexing. I have many files of around 40,000 lines that process just fine, but if you were to make a database for every student in the whole school for every school year for the next 20 years, you'd need more of the MySQL method.

Anyway. My advice would be to figure out what database format you want; a fixed block like the example above, or one with flexibility to add stuff as you go like the other example. Once you're settled on something, write a very small test database and post it if you have any questions.

I read and write (CSV) files of multiple GB to and from in-memory structures in just seconds by reading binary blocks of 4MB (the block size I found fastest).
This can also be used for the above formats if the file might grow huge.
Just let me know when you need that.
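
Until then, a simplified sketch of the block-reading idea. This is not the 4MB-chunk routine mentioned above: it assumes the file is small enough to fit in one string, reads it with a single GET, and then splits it into lines in memory. The file name is an assumption.

```basic
' Sketch only: reads the whole file in one block instead of 4MB chunks.
file$ = "MYFILE.DAT" ' assumed file name
DIM p AS LONG, q AS LONG, lineCount AS LONG

IF _FILEEXISTS(file$) THEN
    OPEN file$ FOR BINARY AS #1
    IF LOF(1) > 0 THEN
        buffer$ = SPACE$(LOF(1)) ' GET reads as many bytes as the string is long
        GET #1, , buffer$ ' one disk read for the whole file
    END IF
    CLOSE #1

    p = 1 ' start of the current line inside buffer$
    DO WHILE p <= LEN(buffer$)
        q = INSTR(p, buffer$, CHR$(10)) ' next line feed
        IF q = 0 THEN q = LEN(buffer$) + 1 ' last line has no line feed
        a$ = MID$(buffer$, p, q - p)
        IF RIGHT$(a$, 1) = CHR$(13) THEN a$ = LEFT$(a$, LEN(a$) - 1) ' drop the CR of a CRLF pair
        lineCount = lineCount + 1 ' a$ now holds one line; parse it here
        p = q + 1
    LOOP
    PRINT lineCount; "lines read"
END IF
```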