Suggestion: Preallocated file open
#1
Windows can do it and I think Linux can too.

On file open, "preallocate" the space to be used, so that the file does not get fragmented as it grows.

A command to open would look like: open "foo.txt" for output as #1 size=4096  <-- This would preallocate a file 4096k in size.

Of course this would not work for appending.  I only suggest this because a lot of huge files I write out become fragmented.
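Until something like size= exists, a rough stand-in can be written in plain QB64. This is only a sketch: PreSizeFile and "foo.bin" are made-up names, and it merely extends the file to the wanted length in one step. Whether the space ends up contiguous is still decided by the file system. Note that Open ... For Output would truncate the file again, so the data has to be written with Put in BINARY mode.

```basic
Const TARGETSIZE = 4096 * 1024 ' 4096 KB, hypothetical size for this sketch

PreSizeFile "foo.bin", TARGETSIZE

Open "foo.bin" For Binary As #1
Print "File length is now"; LOF(1); "bytes"
a$ = "first line of real data" + Chr$(13) + Chr$(10)
Put #1, 1, a$ ' overwrite the reserved space from the start
Close #1
End

' Hypothetical helper: grow fileName$ to sizeBytes&& before real data is written.
Sub PreSizeFile (fileName$, sizeBytes&&)
    Dim ff As Integer, filler As _Unsigned _Byte
    If sizeBytes&& < 1 Then Exit Sub
    ff = FreeFile
    Open fileName$ For Binary As #ff
    ' Writing a single byte at the last position extends the file to sizeBytes&&.
    If LOF(ff) < sizeBytes&& Then Put #ff, sizeBytes&&, filler
    Close #ff
End Sub
```

The drawback is that any reserved space that is not overwritten stays in the file as padding, so the size has to be known fairly exactly in advance.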

Just a thought.......
It could get very messy........
Reply
#2
How do you open a file for OUTPUT and get it fragmented?  All it does is append the data one line after the other automatically.

Can you share an example of this astonishing process?  The very idea boggles my brain completely.
Reply
#3
Would not be able to open for input/output except if "detailed adjustment" is involved.

Why Linux doesn't need disk defragmenting software is something I haven't gotten around to looking up. However, what "casper" (the Ubuntu utility for persistence) probably does is create one giant file, as large as possible, with contiguous sectors. Then it's simply treated like any other file system, which means low-level calls that are always "OPEN FOR RANDOM" in BASIC.

Porteus and others definitely work like this, which involves creating modules that are loaded at startup to add functionality.

I also don't know what is done during startup while it reports "<<VOLUMELABEL>> clean: xx/yy files, aa/bb blocks", if the OS is willing to show it to you (for example Debian "Testing" or Manjaro KDE). Some kind of rearrangement of files must be done to be able to allocate at least a single 4 GB file later.

Another thing: "LEN = 4096" in an OPEN statement means 4096 bytes, not 4096 kB or 4096 KB. On Linux as well as Windows, the larger the disk, the larger the allocation unit (cluster) size tends to be, depending on how it is formatted, for performance reasons.
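To illustrate that LEN = is a record length in bytes, here is a small sketch. The file name "records.dat" and the record layout are made up for the example.

```basic
' Sketch only: "records.dat" and the 4096-byte record are assumptions for illustration.
Dim rec As String * 4096 ' one record = 4096 bytes

Open "records.dat" For Random As #1 Len = Len(rec)
rec = "record number ten"
Put #1, 10, rec ' writing record 10 extends the file to at least 10 records
Print "Record length:"; Len(rec); "bytes"
Print "File length:"; LOF(1); "bytes"
Print "Records in file:"; LOF(1) \ Len(rec)
Close #1
```

The file length reported by LOF is counted in bytes as well, so dividing it by the record length gives the number of records.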
Reply
#4
(01-09-2023, 01:24 AM)SMcNeill Wrote: How do you open a file for OUTPUT and get it fragmented?  All it does is append the data one line after the other automatically.

Can you share an example of this astonishing process?  The very idea boggles my brain completely.

Code: (Select All)
_Title "Hack it off"
'
' get home path
'
homepath$ = Command$(0): p = InStr(homepath$, "\"): home$ = Left$(homepath$, p): p = InStr(p + 1, homepath$, "\"): homepath$ = Left$(homepath$, p)

'
' fast clean first
'
If _FileExists(home$ + "save.txt") Then Kill home$ + "save.txt"

Open home$ + "save.org" For Binary As #1
Open home$ + "save.txt" For Output As #2
Print "Please be patient this may take some time ......"
Do Until EOF(1)
    Line Input #1, f$
    p = InStr(17, f$, "\")
    If p <> 0 Then Print #2, Right$(f$, Len(f$) - p)
Loop
System
save.org is a file list of directories that includes the full path in each filename.
The first 17 characters of each line are not important.  The first \ found after them is the end of the directory path information.  Everything after it is wanted.
It's saved to save.txt using the Print # statement.  This runs nightly in Task Scheduler.  Last night save.txt had 23 fragments.

Save.org starts at over 3 MB of text.  Save.txt ends up just under 2 MB of text.
Defraggler from Piriform was used to determine the number of fragments created.
Reply
#5
Every file system fragments, Linux as well as Windows. Before Windows XP, one had to run a defragmenter. These were shareware programs where you could watch the scattered data on the hard drive being merged. Since Win XP this has happened automatically in the background and is no longer an issue at all today.

As for Linux: the file system is also fragmented and defragmented under ext2/3/4 & Co., but there this was always done automatically when booting up or during operation, so the normal user never noticed anything about it. You cannot see the work in the background either if KDE or Gnome is started right away, as probably 99% of users do.

Until the end of 2011, Suse 11.2, I always started my system with the command line and then started KDE via "startx". I could always see what was happening when booting. Then, at intervals of two or three weeks, one could see how the system checked the individual partitions and, of course, defragmented them at the same time.

Again as a conclusion: Every file system fragments, only this happens in the background, and the functions for this are getting better and better.


Then just like this: Login SuSE in the VirtualBox

[Image: pj7jm5mg.jpg]
Reply
#6
But this is the nature of M$'s operating system, and I think you already know that. It's frustrating that Windows works this way despite all the goodness and advanced technology NTFS promised. Fragmentation is less likely, yes, but still possible. With FAT, fragmentation is simply inevitable. When someone needs more disk space and decides that a big enough file is no longer needed, that file is deleted. Usually a bunch of files are copied afterwards that easily fill in that gap. The OS was programmed a long time ago (since floppy disks) to remember that location, most of all if it has to access the disk a few seconds after the last request on it. This is how it has to behave.

This came from the demand for better performance than sequential tape, where there was definitely no such thing as fragmentation.

I almost forgot to add that I have used payware defragmentation software for Windows in the past. I was frustrated that it refused to work on Windows XP SP3 after SP1. I started to notice that the OS became even greedier and more determined to wreck things the more that software was used to straighten it out, apparently because its own disk defrag was garbage.

Linux or MacOS might be the same way but somehow they turn out better. But must use their "native" systems such as "btrfs" or "ext4" for Linux. The fragmentation problem with FAT would remain regardless of OS.
Reply
#7
(01-09-2023, 03:36 PM)Kernelpanic Wrote: Again as a conclusion: Every file system fragments, only this happens in the background, and the functions for this are getting better and better.
(image)

Nice screen. Too bad now that distro and many others now rely on "systemd". A few days ago I created an installation with Gecko Linux which is based on OpenSuSE "Leap", with GNOME D.E. Although I dislike that D.E. I'm going to keep it, it's turning out better than I thought. I have to come to grips with slow YaST Software Manager and other OpenSuSE-specific stuff.
Reply
#8
(01-09-2023, 03:44 PM)mnrvovrfc Wrote:
(01-09-2023, 03:36 PM)Kernelpanic Wrote: Again as a conclusion: Every file system fragments, only this happens in the background, and the functions for this are getting better and better.
(image)

Nice screen. Too bad now that distro and many others now rely on "systemd". A few days ago I created an installation with Gecko Linux which is based on OpenSuSE "Leap", with GNOME D.E. Although I dislike that D.E. I'm going to keep it, it's turning out better than I thought. I have to come to grips with slow YaST Software Manager and other OpenSuSE-specific stuff.

Yast is OK, it only has its "little" quirks. I got to know all of them in almost 15 years with SuSE... but I never loved it.
Reply
#9
As explained above, fragmentation is inherent to the OS/filesystem.

On a side note, when reading huge text files, this approach is a lot faster:

Code: (Select All)
Function BIG.read& (fileName$, eol$) ' 4M lines/sec
  Const BLOCKSIZE = 4194304 '=64*65536 = 4 MB
  If Not _FileExists(fileName$) Then BIG.read& = 0: Exit Function
  'Print "Reading lines from "; fileName$; " ";: cpos% = Pos(0)
  eoll% = Len(eol$)
  Dim block As String * BLOCKSIZE
  ff% = FreeFile
  Open fileName$ For Binary Access Read As #ff%
  blocks& = (LOF(ff%) + Len(block) - 1) \ Len(block) ' round up to whole blocks
  sep& = 0
  lines& = -1
  $Checking:Off
  For curblock& = 1 To blocks&
    Get #ff%, , block
    If curblock& > 1 Then
      buf$ = Mid$(buf$, sep&) + block
      r0& = InStr(buf$, eol$) + eoll%
    Else
      buf$ = block
      r0& = 1
    End If
    r1& = InStr(r0&, buf$, eol$)
    Do While r1& >= r0& And r0& > 0
      lin$ = Mid$(buf$, r0&, r1& - r0& + eoll%)
      ret% = BIG.line(lin$) ' Process lin$
      lines& = lines& + 1
      sep& = r1&: r0& = r1& + eoll%: r1& = InStr(r0&, buf$, eol$)
    Loop
    'Locate , cpos%, 0: Print lines&;
  Next curblock&
  $Checking:On
  Close #ff%
  buf$ = ""
  'Locate , cpos%, 0
  BIG.read& = lines&
End Function

Function BIG.line% (lin$)
  'process lin$ here
End Function
45y and 2M lines of MBASIC>BASICA>QBASIC>QBX>QB64 experience
Reply
#10
My point is that "every file system fragments" is not a good answer.  Linux is very good at not creating fragments. Both Windows and Linux have a file open call that preallocates a contiguous file, if you know what size you want.  Look at the rsync options in Linux: one of them is --preallocate, and since every file has a known length, that makes it easy.
On Windows, "WinRAR" does not preallocate, hence lots of fragmentation.  But the copy function of "Total Commander" does.  I have 1.5 TB on an external HD, and not one fragment has ever been created.  TC does not partially fill holes; it just goes elsewhere to store the file.  If the file is small enough to fit, it goes in the hole.  I love that part of TC.

If you are not using an SSD or RAM drive, every time the HD head has to move, it takes time.  To speed up QB64, a contiguous file is important.
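In the meantime, one thing that may help is to hand the file system the whole output in one write instead of thousands of small Print # appends. Below is a sketch of the save.org filter from earlier in the thread, reworked that way. It assumes both files are in the current directory and that the output fits in memory; the buffer names are made up. It does not guarantee a contiguous file, it only lets the OS see the full size at once.

```basic
' Sketch: collect all output in memory, then write it with a single Put.
Dim outBuf As String, used As Long

If Not _FileExists("save.org") Then Print "save.org not found": End

Open "save.org" For Binary As #1
' Every output line is shorter than its input line, so LOF(1) bytes is enough room.
outBuf = Space$(LOF(1))
used = 0
Do Until EOF(1)
    Line Input #1, f$
    p& = InStr(17, f$, "\")
    If p& <> 0 Then
        piece$ = Mid$(f$, p& + 1) + Chr$(13) + Chr$(10)
        Mid$(outBuf, used + 1, Len(piece$)) = piece$ ' copy into the buffer in place
        used = used + Len(piece$)
    End If
Loop
Close #1

' BINARY mode does not truncate, so remove any old copy first.
If _FileExists("save.txt") Then Kill "save.txt"
Open "save.txt" For Binary As #2
outBuf = Left$(outBuf, used)
If used > 0 Then Put #2, 1, outBuf ' one write for the whole file
Close #2
System
```

Filling a pre-sized buffer with the Mid$ statement avoids growing a string a few bytes at a time, which matters with a couple of megabytes of text.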
Reply



