astuces_jeux Posts 731 Registered Wednesday 15 November 2000 Status Member Last activity 27 May 2010 - 20 Nov. 2006 at 08:07
I've built my own web search engine, but do I have to add the pages to MySQL one at a time, or is there another way???

8 replies

syndrael Posts 2378 Registered Monday 4 February 2002 Status Member Last activity 29 December 2012 20
20 Nov. 2006 at 08:40
I recommend PhpDig, which is fairly simple to install. There is also
mnoGoSearch, but for that one you will need a bit of Perl, if I remember correctly.

astuces_jeux Posts 731 Registered Wednesday 15 November 2000 Status Member Last activity 27 May 2010
20 Nov. 2006 at 15:41
And where do you get that???

syndrael Posts 2378 Registered Monday 4 February 2002 Status Member Last activity 29 December 2012 20
20 Nov. 2006 at 16:19
At any good butcher's shop.. Google is your friend --> PhpDig and off you go..

Do you need any others???

astuces_jeux Posts 731 Registered Wednesday 15 November 2000 Status Member Last activity 27 May 2010
20 Nov. 2006 at 16:49
Fine, but how does PhpDig work??? Actually I used the code from rottmanh's search, or something like that, and that part is ready. But isn't there a program that sends every page viewed to the database, or that adds all the pages, for my web search engine??? If PhpDig does what I'm asking with a different search engine, I'm all for it!!!

syndrael Posts 2378 Registered Monday 4 February 2002 Status Member Last activity 29 December 2012 20
20 Nov. 2006 at 18:32
Why does a page have to be viewed to be in the database? Are you sure you want a search engine?

The principle of a search engine is to index your page when it is CREATED,
meaning it splits the page into all the words that might be
searched for in your engine.

Good luck.
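
To make the indexing idea concrete, here is a minimal sketch of an inverted index: each page is split into words, and each word points to the pages that contain it. The function names (tokenize, indexPage, search) and the page URLs are made up for illustration, and the index lives in a PHP array rather than MySQL. This is not how PhpDig is implemented, only the general principle.

```php
<?php
// Minimal inverted index sketch (illustrative names, not PhpDig's code).

// Split text into lowercase ASCII words, dropping words shorter than $minLength.
function tokenize($text, $minLength = 3)
{
    $words = preg_split('/[^a-z0-9]+/', strtolower(strip_tags($text)), -1, PREG_SPLIT_NO_EMPTY);
    $kept = array();
    foreach ($words as $word) {
        if (strlen($word) >= $minLength) {
            $kept[] = $word;
        }
    }
    return $kept;
}

// Add one page to the index: word => list of URLs containing that word.
function indexPage(array &$index, $url, $html)
{
    foreach (array_unique(tokenize($html)) as $word) {
        $index[$word][] = $url;
    }
}

// Look up a single word; returns the list of matching URLs (possibly empty).
function search(array $index, $word)
{
    $word = strtolower($word);
    return isset($index[$word]) ? $index[$word] : array();
}

$index = array();
indexPage($index, 'http://monsite/a.html', '<h1>Cheat codes</h1> for old games');
indexPage($index, 'http://monsite/b.html', '<p>New games released</p>');

print_r(search($index, 'games'));
```

In a real engine the word-to-page pairs would be stored in database tables, so the indexing step runs once per page and searches only read the tables.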

astuces_jeux Posts 731 Registered Wednesday 15 November 2000 Status Member Last activity 27 May 2010
20 Nov. 2006 at 18:34
I don't understand!! How am I supposed to put the pages in it??

syndrael Posts 2378 Registered Monday 4 February 2002 Status Member Last activity 29 December 2012 20
20 Nov. 2006 at 18:41
The simplest thing is to install PhpDig and look at the administration
interface; there you'll find a function that lets you index
your pages starting from a location on your site..

Good luck.
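
Indexing "from a location on your site" generally means a spider fetches a start page, collects the links it contains, and repeats on each of them. Below is a rough sketch of the link-collection step only. The function name extractLinks and the sample HTML are hypothetical, the regex is a simplification (a real spider would use an HTML parser), and plain relative links such as "page.html" are skipped here.

```php
<?php
// Illustrative sketch: collect same-site links from one page's HTML.
function extractLinks($html, $baseUrl)
{
    $scheme = parse_url($baseUrl, PHP_URL_SCHEME);
    $host = parse_url($baseUrl, PHP_URL_HOST);

    // Grab the href value of every <a ...> tag (simplified pattern).
    preg_match_all('/<a\s[^>]*href=["\']([^"\']+)["\']/i', $html, $matches);

    $links = array();
    foreach ($matches[1] as $href) {
        if (strpos($href, '/') === 0) {
            // Root-relative link: prefix the scheme and host of the start page.
            $href = $scheme . '://' . $host . $href;
        }
        if (parse_url($href, PHP_URL_HOST) === $host) {
            $links[$href] = true; // array keys remove duplicates
        }
    }
    return array_keys($links);
}

$html = '<a href="/page2.html">two</a> '
      . '<a href="http://monsite/page3.html">three</a> '
      . '<a href="http://example.com/">external</a>';

print_r(extractLinks($html, 'http://monsite/index.html'));
```

Each collected URL would then be fetched and indexed in turn, up to a chosen depth, so pages never have to be entered by hand.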

astuces_jeux Posts 731 Registered Wednesday 15 November 2000 Status Member Last activity 27 May 2010
21 Nov. 2006 at 09:24
Yes, OK, but I install it, then in _connect.php or something like that (in include) I enter my database details, then I upload it via FTP,
and I go to http://monsite/phpdig/admin/install.php and I get:
Fatal error: Call to undefined function: mb_eregi() in /phpdig/includes/config.php on line 109

The error is on line 109, which is:


Here is the code of config.php:

PhpDig Version 1.8.x - See the config file for the full version number.
This program is provided WITHOUT warranty under the GNU/GPL license.
See the LICENSE file for more information about the GNU/GPL license.
Contributors are listed in the CREDITS and CHANGELOG files in this package.
Developer from inception to and including PhpDig v.1.6.2: Antoine Bajolet
Developer from PhpDig v.1.6.3 to and including current version: Charter
Copyright (C) 2001 - 2003, Antoine Bajolet,

Copyright (C) 2003 - current, Charter,

Contributors hold Copyright (C) to their code submissions.
Do NOT edit or remove this copyright or licence information upon redistribution.
If you modify code and redistribute, you may ADD your copyright to this notice.


define('PHPDIG_VERSION','1.8.9 RC1');            // no need to change


// error_reporting(0);                           // have PHP report no errors
// error_reporting(E_ALL);                       // have PHP report all errors


define('PHPDIG_ADM_AUTH','1');                   // activates/deactivates login
define('PHPDIG_ADM_USER','admin');               // login username
define('PHPDIG_ADM_PASS','admin');               // login password


$phpdig_language = "en";                         // language: ca, cs, da, de, en, es, fr, gr, it, nl, no, pt, ru
define('PHPDIG_ENCODING','utf-8');               // KEEP AS utf-8 !!!


// you may have to change this constant depending on the page encoding, for instance...
// define('DETECT_ORDER','UTF-8,ISO-8859-7,ASCII'); // or
// define('DETECT_ORDER','UTF-8,Windows-1251,ASCII'); // or
// define('DETECT_ORDER','UTF-8,BIG-5,ASCII'); // or
// define('DETECT_ORDER','UTF-8,JIS,KOI8-R,EUC-KR,EUC-JP,SJIS,BIG-5'); // etcetera
// the first non UTF-8 encoding in the constant that 'matches' the page is used in conversion to UTF-8.
// note that some pages can match multiple encodings even though only one encoding displays correctly.
// for example, if you do a search and see chinese characters in german text, the order is not correct.
// you may need to set/reset this constant, as there is no perfect ordering for all pages.
// furthermore, some encodings have multiple names (e.g., CP1251 like Windows-1251).
// if needed, edit the function phpdigMakeUTF8 in robot_functions.php to account for multiple names.

//---------CONVERT JAPANESE KANA (only for Japanese)

define('ENABLE_JPKANA',false);                   // activates/deactivates japanese kana conversion
define('CONVERT_JPKANA','KVa');                  // see [URL missing] for options

//---------PATH SETTINGS

// note: if you want a different path, you need to add that path (relative path up to the
// admin directory: ../dir or full path up to the admin directory: /full/path/to/dir) in
// the first if statement in this config.php file - for example:
// && ($relative_script_path != "../dir") // relative path
// && ($relative_script_path != "/full/path/to/dir") // full path
// you may also need to set $relative_script_path to this path in search.php, clickstats.php,
// and function_phpdig_form.php depending on what files you are calling from where.
// note: double dot means go back one and single dot means stay in same directory
// note: the path should be UP TO but NOT INCLUDING the admin directory - NO ending slash

/***** example
* phpdig installed at: [URL missing]
* want search page at: [URL missing]
* copy [URL missing]
* copy [URL missing]
* set $relative_script_path = './phpdig'; in search.php, clickstats.php, and function_phpdig_form.php
* add ($relative_script_path != "./phpdig") && to if statement
*****/

// full path up to but not including the phpdig admin directory, no end slash

//---------SECURITY CHECK (first if statement in config file)

// this chunk of code NEEDS to be here for security - checks to see that $relative_script_path is set to a valid value
if ((!isset($relative_script_path)) || (($relative_script_path != ".") &&
   ($relative_script_path != "..") && ($relative_script_path != ABSOLUTE_SCRIPT_PATH))) {
   // echo "\n\nPath $relative_script_path not recognized!\n\n";
   exit(); // assumed: the original body of this block is missing from the paste
}


// note: if you receive an "undefined index" message that means that your server is not recognizing one or
// some of the $_SERVER variables so check your PHP info and set the $_SERVER variables to those recognized
// by your server: see [URL missing] for a list. there are also $_SERVER variables
// in the custom_rss.php and custon_search.php files to prevent direct access to those files too. you could
// use "if (realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME'])) { exit(); }" instead, assuming that
// $_SERVER['SCRIPT_FILENAME'] is defined on your server.

// block bodies were missing from the paste; exit() is assumed from the note above
if ((isset($_SERVER['SCRIPT_FILENAME'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_FILENAME']))) { exit(); }
if ((isset($_SERVER['SCRIPT_URI'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_URI']))) { exit(); }
if ((isset($_SERVER['SCRIPT_URL'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_URL']))) { exit(); }
if ((isset($_SERVER['REQUEST_URI'])) && (mb_eregi("config.php",$_SERVER['REQUEST_URI']))) { exit(); }
if ((isset($_SERVER['SCRIPT_NAME'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_NAME']))) { exit(); }
if ((isset($_SERVER['PATH_TRANSLATED'])) && (mb_eregi("config.php",$_SERVER['PATH_TRANSLATED']))) { exit(); }
if ((isset($_SERVER['PHP_SELF'])) && (mb_eregi("config.php",$_SERVER['PHP_SELF']))) { exit(); }


// this chunk of code NEEDS to be here for security - checks to see that $template is set to a valid value
if (isset($_REQUEST['template_demo'])) {
    $template_demo = $_REQUEST['template_demo'];
}
$templates_array = array('black.html','bluegrey.html','corporate.html','green.html','grey.html','lightgreen.html','linear.html','newspaper.html','phpdig.html','simple.html','terminal.html','yellow.html','gaagle.html');
if(isset($template_demo) && in_array($template_demo, $templates_array)) {
    $template = "$relative_script_path/templates/$template_demo";
} else {
    $template = "$relative_script_path/templates/phpdig.html";
}

// alternatively force the $template value to a valid value
// $template = "$relative_script_path/templates/phpdig.html";
// if using array, set $template = "array";
// if using classic, set $template = "classic";

// now set $template_demo to a clean $template filename or empty string
if (($template != "array") && ($template != "classic")) {
    $template_demo = mb_substr($template,mb_strrpos($template,"/")+1); // get filename.ext from $template variable
} else {
    $template_demo = "";
}


define('SEARCH_PAGE','search.php');              // the name of the search page
define('SEARCH_DEFAULT_LIMIT',10);               // search results per page
define('LINK_TARGET','_blank');                  // target for result links

define('SEARCH_BOX_SIZE',15);                    // search box size
define('SEARCH_BOX_MAXLENGTH',50);               // search box maxlength

define('HIGHLIGHT_BACKGROUND','#FFBB00');        // highlighting background color, only for classic mode
define('HIGHLIGHT_COLOR','#000000');             // highlighting text color, only for classic mode

define('WEIGHT_IMGSRC','./tpl_img/weight.gif');  // baragraph image path
define('WEIGHT_HEIGHT','5');                     // baragraph height
define('WEIGHT_WIDTH','50');                     // max baragraph width

define('DISPLAY_DROPDOWN',true);                 // display dropdown on search page
define('DROPDOWN_URLS',true);                    // show URLs in dropdown: DISPLAY_DROPDOWN needs to be true

define('DISPLAY_SNIPPETS',true);                 // display text snippets
define('DISPLAY_SNIPPETS_NUM',4);                // max snippets to display
define('DISPLAY_SUMMARY',false);                 // display description

define('SNIPPET_DISPLAY_LENGTH',200);            // max chars displayed in each snippet
define('SUMMARY_DISPLAY_LENGTH',150);            // max chars displayed in summary
define('TITLE_DISPLAY_LENGTH',100);              // max chars displayed in title

define('PHPDIG_DATE_FORMAT','\1-\2-\3');         // date format for last update
                                                 // \1 is year, \2 month and \3 day
                                                 // if using rss, use date format \1-\2-\3

define('SEARCH_DEFAULT_MODE','start');           // default search mode (start|exact|any)
                                                 // start is AND OPERATOR, exact is EXACT PHRASE, and any is OR OPERATOR
                                                 // in language pack make the appropriate changes to 'w_begin', 'w_whole', and 'w_part'
                                                 // e.g., 'w_begin' => 'and operator', 'w_whole' => 'exact phrase', 'w_part' => 'or operator'

define('PHPDIG_LOGS',true);                      // write logs from searches for statistics
define('LOG_CLICKS',true);                       // log clicks from searches for statistics

define('NUMBER_OF_RESULTS_PER_SITE',-1);         // max number of search results per site
                                                 // use -1 to display all search results


define('LIST_ENABLE',true);                      // activates/deactivates listing of past queries
define('LIST_PAGE','list.php');                  // the name of the list page
define('LIST_NEW_WINDOW',1);                     // open queries in new window
define('LIST_SHOW_ZEROS',0);                     // show queries with zero results
define('LIST_DEFAULT_LIMIT',20);                 // listings per page - positive integer of ten - 10,20,30,...
define('LIST_META_TAG','<meta name="robots" content="noindex,nofollow">'); // meta tag for list page


define('TEXT_STORAGE_AMOUNT',10000);             // max characters per page to store in files/tables

define('TEXT_CONTENT_PATH','text_content/');     // path to text content files directory for indexed page content
define('CONTENT_TEXT',0);                        // activates/deactivates the storage of text content in files

define('SPIDER_MAX_LIMIT',20);                   // max (re)index search depth - used for shell and admin panel dropdown
define('RESPIDER_LIMIT',5);                      // max update search depth - only used for browser, not used for shell

define('LINKS_MAX_LIMIT',20);                    // max (re)index links per - used for shell and admin panel dropdown
define('RELINKS_LIMIT',5);                       // max update links per - only used for browser, not used for shell

define('LIMIT_TO_DIRECTORY',false);              // limit index to given (sub)directory where (sub)directories of give (sub)directory are NOT indexed
                                                 // for limit to directory, URL format must either have file at end or ending slash at end
                                                 // e.g., [URL missing] (WITH ending slash) or [URL missing]

define('ALLOW_SUBDIRECTORIES',false);            // limit index to given (sub)directory where (sub)directories of give (sub)directory are indexed
                                                 // if set to true, LIMIT_TO_DIRECTORY must also be set to true

define('LIMIT_DAYS',0);                          // default days before reindexing a page via admin panel or shell is allowed
                                                 // this does not automatically reindex - to auto reindex, you need to run a cron job

define('SMALL_WORDS_SIZE',2);                    // min size of word to not index - must be two or more
define('MAX_WORDS_SIZE',300);                    // max size of word to not index - words separated by spaces

define('PHPDIG_EXCLUDE_COMMENT','<!-- phpdigExclude -->');  // comment to exclude part of a page
define('PHPDIG_INCLUDE_COMMENT','<!-- phpdigInclude -->');  // comment to include part of a page
                                                            // comments must be on their own lines in the HTML source
                                                            // text within comments is not indexed
                                                            // links within comments are indexed

define('APPEND_TITLE_META',false);               // append title and meta information to indexed results
define('TITLE_WEIGHT',3);                        // relative title weight: APPEND_TITLE_META needs to be true

define('PHPDIG_SESSID_REMOVE',true);             // remove SIDs or variables from links being indexed
define('PHPDIG_SESSID_VAR','PHPSESSID,s');       // name of SID or variable to remove - cAsE sEnSiTiVe
                                                 // can be 's' or comma delimited 's,id,var,foo,etc'

define('PHPDIG_DEFAULT_INDEX',false);            // consider (index|default)\.(php|phtml|asp|htm|html)$ the same as /
                                                 // e.g., [URL missing] same as [URL missing]

define('PHPDIG_IN_DOMAIN',true);                 // jump hosts in the same domain
                                                 // e.g., if the host is [host missing], the domain is [domain missing]

define('SILENCE_404S',true);                     // silence 404 output when indexing

define('TEMP_FILENAME_LENGTH',8);                // filename length of temp files that are created when indexing
                                                 // if using external tools with extension, use 4 for a filename of length 8

define("END_OF_LINE_MARKER","\r\n");             // end of line marker - keep double quotes

define('CHUNK_SIZE',1024);                       // pages are divided into chunks for processing
                                                 // chunk size for regex processing

define('USE_RENICE_COMMAND','1');                // use renice for process priority
                                                 // see [URL missing] to learn about renice


define('USE_IS_EXECUTABLE_COMMAND','1');         // use PHP is_executable for external binaries
                                                 // if set to true, is_executable used
                                                 // set to '0' if is_executable is undefined

// note: chances are that you do not need to set any options, as phpdig should use the 'DETECT_ORDER' constant for encoding
// only set an extension if the external binary output is not STDOUT and a different extension is produced by the external binary
// e.g., use '.txt' (including the period) if the external binary writes output to filename.txt instead of piping output to STDOUT

define('PHPDIG_INDEX_MSWORD',false);                             // activate/deactivate
define('PHPDIG_PARSE_MSWORD','/usr/local/bin/catdoc');           // full path to external binary
define('PHPDIG_OPTION_MSWORD','');                               // external binary options, e.g., '-s utf-8'
define('PHPDIG_MSWORD_EXTENSION','');                            // only set if NOT STDOUT

define('PHPDIG_INDEX_PDF',false);                                // activate/deactivate
define('PHPDIG_PARSE_PDF','/usr/local/bin/pdftotext');           // full path to external binary
define('PHPDIG_OPTION_PDF','');                                  // external binary options, e.g., '-enc UTF-8'
define('PHPDIG_PDF_EXTENSION','.txt');                           // only set if NOT STDOUT

define('PHPDIG_INDEX_MSEXCEL',false);                            // activate/deactivate
define('PHPDIG_PARSE_MSEXCEL','/usr/local/bin/xls2csv');         // full path to external binary
define('PHPDIG_OPTION_MSEXCEL','');                              // external binary options, e.g., '-s utf-8'
define('PHPDIG_MSEXCEL_EXTENSION','');                           // only set if NOT STDOUT

define('PHPDIG_INDEX_MSPOWERPOINT',false);                       // activate/deactivate
define('PHPDIG_PARSE_MSPOWERPOINT','/usr/local/bin/ppt2text');   // full path to external binary
define('PHPDIG_OPTION_MSPOWERPOINT','');                         // external binary options, e.g., 'whatever'
define('PHPDIG_MSPOWERPOINT_EXTENSION','');                      // only set if NOT STDOUT


// note: make sure ABSOLUTE_SCRIPT_PATH is the full path up to but not including the admin dir, no ending slash
// note: CRON_ENABLE set to true writes a file at CRON_CONFIG_FILE containing the cron job information
// the CRON_CONFIG_FILE must be 777 permissions if applicable to your OS/setup.
// you still need to call the CRON_CONFIG_FILE to run the cron job !!!
// from shell: crontab CRON_CONFIG_FILE to set the cron job: replace CRON_CONFIG_FILE with actual file
// from shell: crontab -l to list and crontab -d to delete

define('CRON_ENABLE',false);                          // activates/deactivates creation of cron file
define('CRON_EXEC_FILE','/usr/bin/crontab');          // full path to crontab
define('CRON_CONFIG_FILE',ABSOLUTE_SCRIPT_PATH.'/admin/temp/cronfile.txt'); // where to write cron file
define('PHPEXEC','/usr/local/bin/php');               // full path to PHP

//---------FTP SETTINGS

define('FTP_ENABLE',0);                               // activate/deactivate ftp for distant indexing
define('FTP_HOST','<ftp host>');                      // if distant indexing, set the ftp host
define('FTP_PORT',21);                                // if distant indexing, set the ftp port
define('FTP_PASV',1);                                 // activates/deactivates passive mode
define('FTP_PATH','');      // distant path from the ftp root
define('FTP_TEXT_PATH','text_content');               // ftp path to the text content directory
define('FTP_USER','<ftp usename>');                   // ftp username
define('FTP_PASS','<ftp password>');                  // ftp password

//---------RSS SETTINGS

define('ALLOW_RSS_FEED',false);                       // activate/deactivate feed - if true, set rss directory to 777 permissions if applicable
$theenc = PHPDIG_ENCODING;                            // needs to be same encoding used in index - do not change !!!
$theurl = "";                               // site offering the rss feed
$thetitle = "";                             // title for site offering the rss feed
$thedesc = "PhpDig :: Web Spider and Search Engine";  // description of site offering the rss feed
$thedir = "./rss";                                    // the rss directory name, no ending slash
$thefile = "search.rss";                              // value used in rss filenames


// regexp for forbidden links - some links may return text/html mime-type but should not be indexed so forbid them !!!
// you can also expand the FORBIDDEN value by writing a regex to forbid certain links containing 'word' from being indexed


// character class MUST have "[ characters go in here ]*" format
// do NOT enter [ or ] in character class - blackslash other special characters
// see [URL missing] for further info
// $allowed_link_chars = "[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]*"; // includes space and () but not good with javascript
$allowed_link_chars = "[:%/?=&;\\,._a-zA-Z0-9|+~-]*";

//----------MONTH NAMES

// month names in iso dates
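$month_names = array ('jan'=>1, 'feb'=>2, 'mar'=>3, 'apr'=>4, 'may'=>5, 'jun'=>6,
                      'jul'=>7, 'aug'=>8, 'sep'=>9, 'oct'=>10, 'nov'=>11, 'dec'=>12);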


// apache fancy indexing queries to not follow
$apache_indexes = array (  "?N=A" => 1,
                           "?N=D" => 1,
                           "?M=A" => 1,
                           "?M=D" => 1,
                           "?S=A" => 1,
                           "?S=D" => 1,
                           "?D=A" => 1,
                           "?D=D" => 1,
                           "?C=N&amp;O=A" => 1,
                           "?C=M&amp;O=A" => 1,
                           "?C=S&amp;O=A" => 1,
                           "?C=D&amp;O=A" => 1,
                           "?C=N&amp;O=D" => 1,
                           "?C=M&amp;O=D" => 1,
                           "?C=S&amp;O=D" => 1,
                           "?C=D&amp;O=D" => 1);


// check to make sure a language is set
if (!isset($phpdig_language)) {
    $phpdig_language = "en";
}

// include a language file
define('PHPDIG_LANG_CONSTANT',$phpdig_language); // this line is needed for classic
if (is_file("$relative_script_path/locales/$phpdig_language-language.php")) {
    include "$relative_script_path/locales/$phpdig_language-language.php";
}
elseif (is_file("$relative_script_path/locales/en-language.php")) {
    include "$relative_script_path/locales/en-language.php";
}
else {
    die("Unable to select language pack.\n");
}

// connect to database
if ((!isset($no_connect)) || ($no_connect != 1)) {
    if (is_file("$relative_script_path/includes/connect.php")) {
        include "$relative_script_path/includes/connect.php";
    }
    else {
        die("Unable to find connect.php file.\n");
    }
}

// include the libraries
if (is_file("$relative_script_path/libs/phpdig_functions.php")) {
    include "$relative_script_path/libs/phpdig_functions.php";
}
else {
    die ("Unable to find phpdig_functions.php file.\n");
}
if (is_file("$relative_script_path/libs/function_phpdig_form.php")) {
    include "$relative_script_path/libs/function_phpdig_form.php";
}
else {
    die ("Unable to find function_phpdig_form.php file.\n");
}
if (is_file("$relative_script_path/libs/mysql_functions.php")) {
    include "$relative_script_path/libs/mysql_functions.php";
}
else {
    die ("Unable to find mysql_functions.php file.\n");
}

// check the template value
if ((!isset($template)) || ((!is_file($template)) && ($template != "array") && ($template != "classic"))) {
    die ("Unable to render template file.\n");
}

// send encoding if needed
if (!headers_sent()) {
   header('Content-type:text/html; Charset='.PHPDIG_ENCODING);
}

// turn off magic_quotes_runtime for escaping purposes

// turn off magic_quotes_sybase for escaping purposes

// check that the tables exist
if ((!isset($no_connect)) || ($no_connect != 1)) {

Can anyone help me???
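
A note on the error itself: "Call to undefined function: mb_eregi()" means the PHP installation on the server does not provide that function. mb_eregi, mb_substr and mb_strrpos (all called in config.php above) belong to PHP's mbstring extension, so a likely cause is that the host's PHP was built without mbstring or without its regex support. The small diagnostic below is a sketch for confirming that; the helper name missingFunctions is made up for this example, and the script would be uploaded as its own file and opened in the browser.

```php
<?php
// Diagnostic sketch: report whether the mbstring functions that
// config.php calls are available in this PHP installation.
function missingFunctions(array $names)
{
    $missing = array();
    foreach ($names as $name) {
        if (!function_exists($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Functions taken from the config.php listing above.
$needed = array('mb_eregi', 'mb_substr', 'mb_strrpos');
$missing = missingFunctions($needed);

echo 'mbstring loaded: ', extension_loaded('mbstring') ? 'yes' : 'no', "\n";
echo 'missing functions: ', $missing ? implode(', ', $missing) : 'none', "\n";
```

If functions are reported missing, the usual options are to ask the host to enable mbstring or to move to a host where it is available. Editing config.php to avoid the calls is unlikely to be enough, since the rest of this PhpDig version presumably relies on mbstring as well.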