Trying to find keywords in text and count the number of times they occur

benf posted this at 15:40 — 22nd February 2010.

Joined: Feb 2005

I stored some keywords in a database and am using file_get_contents to fetch a webpage, then looping through the keywords to match them against the page and count how many times each one occurs. Below is what I have. The problem is that if I search for Jap, it gets counted twice: once for Jap and once for Japanese. I tried using \b in my regex but it failed. Any ideas?

while($row = mysql_fetch_array($query)){
    if (preg_match_all("/$row[0]/i",$text,$matches) > 0){  
      $count = preg_match_all("/$row[0]/i",$text,$matches);
      $s[$x] = array('word'=>$row[0],'occur'=>$count); 
    }
    $x++;
}


benf posted this at 12:12 — 25th February 2010.

They have: 426 posts

Joined: Feb 2005

OK, so I figured this out. First I strip the HTML tags, leaving a space where each one was, using this function:

function striptags ($str) {
 return trim(strip_tags(str_replace('<', ' <', $str)));
}
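
For anyone wondering why the space goes in before each '<': strip_tags() removes tags without leaving a separator, so words that were only separated by a tag get glued together and \b no longer finds a boundary between them. A rough sketch of the effect (the sample HTML is made up; the function is the one from the post):

```php
<?php
// Same helper as in the post: put a space before every '<' so that
// words separated only by a tag are not joined when tags are removed.
function striptags($str) {
    return trim(strip_tags(str_replace('<', ' <', $str)));
}

$html = '<p>Jap</p><p>Japanese food</p>'; // made-up sample markup

$plain  = strip_tags($html); // "JapJapanese food": the two words run together
$spaced = striptags($html);  // "Jap" and "Japanese" are now separated by whitespace

// Whole-word count of "Jap" in each version
$countPlain  = preg_match_all('/\bJap\b/i', $plain, $m);  // 0, no boundary after "Jap"
$countSpaced = preg_match_all('/\bJap\b/i', $spaced, $m); // 1
```

Without the extra space the standalone "Jap" is missed entirely, which may be one way a \b pattern appears to "fail" on raw page content.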

Then I use the while loop below. The \b on either side of the keyword in the regex is the key: it makes the pattern match whole words only, so preg_match_all counts just those occurrences:

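  while($row = mysql_fetch_array($query)){
        // $url is assumed to hold the page text after striptags()
        // preg_quote() escapes any regex special characters in the keyword
        $pattern = '/\b' . preg_quote($row[0], '/') . '\b/i';
        $count = preg_match_all($pattern, $url, $match);
        if ($count > 0){
          $s[$x] = array('word'=>$row[0],'occur'=>$count);
        }
        $x++;
   }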
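
The same idea can be tried without a database. This is only a sketch under some assumptions: the keywords come from a plain array instead of the query, and count_keywords() is a hypothetical helper name, not something from the post. It builds the \b pattern once per keyword and calls preg_match_all a single time:

```php
<?php
// Hypothetical helper: counts whole-word, case-insensitive matches of
// each keyword in $text. Keywords with no matches are left out.
function count_keywords(array $keywords, $text) {
    $result = array();
    foreach ($keywords as $word) {
        // preg_quote() escapes regex metacharacters in the keyword;
        // the second argument also escapes the '/' delimiter.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $count = preg_match_all($pattern, $text, $matches);
        if ($count > 0) {
            $result[] = array('word' => $word, 'occur' => $count);
        }
    }
    return $result;
}

// Made-up sample text and keywords
$text = 'Jap and Japanese are different words; jap appears twice.';
$counts = count_keywords(array('Jap', 'Japanese', 'sushi'), $text);
// "Jap" is counted 2 times, "Japanese" 1 time, "sushi" is skipped
```

One caveat: \b only marks a boundary between a word character and a non-word character, so keywords that begin or end with punctuation (for example "C++") would need a different pattern.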
