readdir with pages

greg

He has: 1,581 posts

Joined: Nov 2005

What's the best method to limit readdir?
I see two options: limit in the readdir() while condition, or limit inside the "if($folder)" block.

Limiting inside the "if($folder)" block is clumsier than limiting the while, because it will still loop over the entire directory after the desired number of folders has been found, whereas limiting the while stops at the desired number.

So, is the following acceptable (it works):

<?php
$loop_count = 0;
$array = array();                 // collected folder names
$handle = opendir('/folder');

while (false !== ($folder = readdir($handle)) && $loop_count < 5) {
  if ($folder != "." && $folder != ".." && $folder != ".htaccess") {
    $array[] = $folder;
    $loop_count++;
  }
}

closedir($handle);
?>

My only concern is $loop_count < 5. What if there are only 4?
Will false !== ($folder = readdir($handle)) simply end the loop, so it doesn't matter?

pr0gr4mm3r

He has: 1,502 posts

Joined: Sep 2006

So you want to get the first 'n' elements from a scandir()? I would use the array_chunk() function to split it up into chunks, and then take the first one.
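
A minimal sketch of that approach, assuming a placeholder directory named 'folder' and chunks of 5 entries:

<?php
// scandir() returns the whole listing in one array (including . and ..).
$entries = scandir('folder');

// array_chunk() splits it into pages of 5 entries each.
$pages = array_chunk($entries, 5);

$firstPage = $pages[0];           // the first 'n' elements
?>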

Greg K

He has: 2,145 posts

Joined: Nov 2003

That would be a recommended way to do it if you needed some kind of sort on the files. However, with the given example of just grabbing the first XX, there are two issues:

1. scandir() returns all entries, including the current (.) and parent (..) directories, so the result still needs to be looped through to eliminate those.

2. If there are 1,000 files, this wastes resources loading up a complete directory listing.

The following is a simplified version. It also assumes that you would want to hide not only .htaccess but any file starting with .ht (since by default Apache is set to deny access to them anyhow):

<?php

$dh = opendir('test');
$aryFiles = array();

while (count($aryFiles) < 5 && ($file = readdir($dh)) !== false) {
   if ($file != '.' && $file != '..' && substr($file, 0, 3) != '.ht') {
      $aryFiles[] = $file;
   }
}
closedir($dh);

?>

The reason I placed the array count first is that if there are already 5 items, there's no need to actually read the directory at all. (With an AND (&&), the first false condition stops any later conditions from executing.)
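
A quick illustration of that short-circuit behaviour, with a hypothetical readNext() standing in for readdir():

<?php
function readNext() {
    echo "readNext() called\n";    // never printed below
    return 'some-file';
}

$items = array('a', 'b', 'c', 'd', 'e');   // already holds 5 items

// count($items) < 5 is false, so && short-circuits and readNext() never runs.
if (count($items) < 5 && readNext() !== false) {
    $items[] = 'f';
}
?>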

-Greg

greg

He has: 1,581 posts

Joined: Nov 2005

Hmm, so get them all, then split them into a multi-dimensional array, one chunk per page.

Perhaps also storing the array in a session for each page too, to avoid repeat dir scanning... interesting ideas...
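
A rough sketch of that session-caching idea, combining the readdir() loop above with array_chunk(); the 'portfolio' directory name and the $_SESSION['folders'] key are placeholders:

<?php
session_start();

// Scan the directory only once per visitor; later page views reuse the
// cached list from the session instead of hitting the file system again.
if (!isset($_SESSION['folders'])) {
    $folders = array();
    $dh = opendir('portfolio');
    while (($entry = readdir($dh)) !== false) {
        if ($entry != '.' && $entry != '..' && substr($entry, 0, 3) != '.ht') {
            $folders[] = $entry;
        }
    }
    closedir($dh);
    $_SESSION['folders'] = $folders;
}

$pages = array_chunk($_SESSION['folders'], 10);   // 10 folders per page
?>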

pr0gr4mm3r

He has: 1,502 posts

Joined: Sep 2006

Are you using them for displaying pages stored in a folder?

greg

He has: 1,581 posts

Joined: Nov 2005

It's for images for a portfolio.

Main folder with xx sub-folders in it, and each sub-folder has xx images in it (one sub-folder for each 'portfolio job'). So my code in my first post actually gets folder names (not files).

The sub-folders will likely never hold more than 10 images, and there will never be more than 100 sub-folders.

So the dir is scanned and the sub-folder names are stored in an array. Then a foreach on the array displays the first image in the sub-dir for that loop. Each image is a link to another page (PHP file) with GET info of the folder name, which then displays all the images from that sub-folder.
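
A hypothetical sketch of that listing page; 'portfolio', 'view.php' and the *.jpg pattern are all placeholder names:

<?php
$folders = array();
$dh = opendir('portfolio');
while (($entry = readdir($dh)) !== false) {
    if ($entry != '.' && $entry != '..' && is_dir("portfolio/$entry")) {
        $folders[] = $entry;
    }
}
closedir($dh);

foreach ($folders as $folder) {
    $images = glob("portfolio/$folder/*.jpg");   // images in this sub-folder
    if ($images) {
        // Show the first image, linking through with the folder name in GET.
        $name = urlencode($folder);
        echo "<a href=\"view.php?folder=$name\"><img src=\"{$images[0]}\" alt=\"$folder\"></a>\n";
    }
}
?>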

The image titles and other text are unfortunately stored in a file, accessed when required. I know a database would have been really useful here, but it would have been the only use for a DB on the entire site, so I didn't bother with one.

@ Greg (no, you, not me...)
Isn't your code the same as mine in my first post?
I know you use count() and I increment a var, but isn't it doing the same thing?
(albeit yours looks more efficient than mine)

Greg K wrote:
2. If there are 1,000 files, this wastes resources loading up a complete directory listing.
Firstly, there won't be (sites can have known, future-proof limitations).

And without storing all sub-folders in an array, how else will I know which are to be on page 1, page 2, page 3, etc.?
Example: I display 10 images per page (one from each sub-folder), and have 50 images total.
How would I determine, and get, only the LAST 10 images for page 5?
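
(For reference, the usual pagination arithmetic is offset = (page - 1) * per_page; a minimal sketch with array_slice(), where $folders is the full array of sub-folder names:)

<?php
$perPage = 10;
$page    = 5;                                 // the requested page
$offset  = ($page - 1) * $perPage;            // 40 for page 5

// Take 10 entries starting at index 40, i.e. entries 41-50 (the last 10).
$current = array_slice($folders, $offset, $perPage);
?>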

Greg K

He has: 2,145 posts

Joined: Nov 2003

I misunderstood the file structure you were looking for. I will check it more when I get home.

They have: 121 posts

Joined: Dec 2008

I don't see anything wrong with your proposed solution...
The docs mention that readdir($handle) returns boolean false when no results are available (you're done), which will cause your while clause to evaluate to false and the loop to stop executing.

If you're still not sure, give it a little test run... create 4 elements, add a debug 'echo' or two, and see how things shake out.
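
(A hypothetical version of that test run, pointing the original loop at a placeholder directory that holds only 4 folders:)

<?php
$loop_count = 0;
$handle = opendir('test_with_4_folders');      // placeholder directory
while (false !== ($folder = readdir($handle)) && $loop_count < 5) {
    if ($folder != '.' && $folder != '..' && $folder != '.htaccess') {
        echo "found: $folder\n";               // debug echo
        $loop_count++;
    }
}
closedir($handle);
echo "collected $loop_count folders\n";        // prints 4: readdir() ran out first
?>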

Cheers,
Shaggy.

greg

He has: 1,581 posts

Joined: Nov 2005

Yeah, I did try that: I printed the array, and with my clause of $loop_count < 5 in the while, it did stop at whatever number I changed the < 5 to.
The array only held 2 folder names, or 5, etc.

I just like throwing things out there for discussion. I enjoy writing code to a good standard, and there's often something someone else knows that I don't - a more efficient method, or even just a slight tweak.

Although I still don't know if adding all sub-folder names to a session array for when the user goes to other pages is best, or if another way would be better.
I don't see another way (without a DB) to determine which sub-folders would be on what page.
Perhaps I could name the folders numerically, 1, 2, 3, 4, etc., but that seems weird somehow..?

They have: 121 posts

Joined: Dec 2008

The file system is going to be at _least_ as fast as any database for simple reads like this.

If the contents of the folders are going to change a lot, especially mid-user-session, you may want to consider the 'read once and cache for the user session' approach suggested above. Changing (or maybe even touching?) the directory's contents will likely change the order of scandir's results, which may throw someone off somewhere down the line...

Numerical folders for each 'page' as you suggest isn't a bad idea either!

My guess would be that the numerical folders for each page would be the 'most efficient' (fastest), if that is what you are after, though they would take a bit more care to maintain. See the sketch below.
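
(A hypothetical sketch of that numbered-folder idea: if the portfolio sub-folders are named 1, 2, 3, ..., the folders belonging to a page can be computed directly, with no directory scan at all. 'portfolio', 'view.php' and the *.jpg pattern are placeholders.)

<?php
$perPage = 10;
$page    = isset($_GET['page']) ? (int)$_GET['page'] : 1;
$first   = ($page - 1) * $perPage + 1;         // e.g. folder 41 for page 5

for ($n = $first; $n < $first + $perPage; $n++) {
    $dir = "portfolio/$n";
    if (!is_dir($dir)) {
        break;                                 // past the last sub-folder
    }
    $images = glob("$dir/*.jpg");              // first image as the thumbnail
    if ($images) {
        echo "<a href=\"view.php?folder=$n\"><img src=\"{$images[0]}\"></a>\n";
    }
}
?>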

Cheers,
Shaggy
