syndrael
20 Nov 2006 at 16:19
You can find it in all the usual places. Google is your friend: search for PhpDig and off you go.
www.framasoft.net/article2165.html
www.phpdig.net/
www.phpdig.net/francaise.php?action=doc
www.comscripts.com/scripts/php.phpdig.596.html
astuces_jeux
20 Nov 2006 at 16:49
OK, but how does PhpDig work? I actually used the "rottmanh's search" code, or something like that, and it is ready. But isn't there a program that sends every viewed page to the database, or that adds all the pages, for my web search engine? If PhpDig does what I'm asking, even with a different search engine, that's fine by me!
syndrael
20 Nov 2006 at 18:32
Why does a page have to be viewed before it goes into the database? Are you sure a search engine is what you want?
The principle of a search engine is that a page is indexed when it is CREATED.
Indexing means splitting the page into all the words that could later be
searched for in your engine.
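To make the indexing idea concrete, here is a minimal sketch of an inverted index in PHP. It is not PhpDig's code: the function names `tokenize`, `buildIndex` and `search`, the sample pages, and the three-character minimum word length are all assumptions chosen for illustration.

```php
<?php
// Minimal inverted-index sketch. All names here are illustrative, not PhpDig's.

// Split a page's HTML into lowercase words of at least $minLength characters.
function tokenize(string $html, int $minLength = 3): array
{
    // Replace tags with spaces so words on either side of a tag stay separate.
    $plain = strtolower(preg_replace('/<[^>]*>/', ' ', $html)); // ASCII lowercasing only
    $words = preg_split('/[^\p{L}\p{N}]+/u', $plain, -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_filter(
        $words,
        fn(string $w): bool => strlen($w) >= $minLength
    ));
}

// Build a map of word => list of page URLs that contain the word.
function buildIndex(array $pages): array
{
    $index = [];
    foreach ($pages as $url => $html) {
        foreach (array_unique(tokenize($html)) as $word) {
            $index[$word][] = $url;
        }
    }
    return $index;
}

// Look a single word up in the index (case-insensitive for ASCII).
function search(array $index, string $word): array
{
    return $index[strtolower($word)] ?? [];
}

// Hypothetical sample pages, keyed by URL.
$pages = [
    '/cheats.html' => '<h1>Game cheats</h1><p>Codes for every level.</p>',
    '/news.html'   => '<p>Latest game news.</p>',
];
$index = buildIndex($pages);
print_r(search($index, 'game'));
```

The point of the sketch is that the work happens once, when a page is added, so a later query is only a lookup. A real engine would store the index in database tables instead of a PHP array.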
syndrael
20 Nov 2006 at 18:41
The simplest approach is to install PhpDig and look at the administration
interface. There you will find a function that lets you index
your pages starting from a given location on your site.
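Indexing "starting from a location" generally means following links outward from one page. The sketch below shows that idea only and is not taken from PhpDig: `extractLinks`, `crawl`, the depth limit, and the in-memory `$site` array (a stand-in for fetching pages over HTTP) are all assumptions made for the example, and only root-relative links are followed.

```php
<?php
// Sketch of spidering from a start location. Not PhpDig's actual code.
// $site is a hypothetical in-memory stand-in for fetching pages over HTTP.

// Return the root-relative links (href="/...") found in an HTML string.
function extractLinks(string $html): array
{
    preg_match_all('/<a\s[^>]*href=["\'](\/[^"\'#]*)/i', $html, $matches);
    return array_values(array_unique($matches[1]));
}

// Breadth-first walk from $start, following links up to $maxDepth levels deep.
function crawl(array $site, string $start, int $maxDepth): array
{
    $visited = [];
    $queue = [[$start, 0]];
    while ($queue) {
        [$path, $depth] = array_shift($queue);
        if (isset($visited[$path]) || !isset($site[$path])) {
            continue; // already visited, or the page does not exist
        }
        $visited[$path] = true;
        if ($depth < $maxDepth) {
            foreach (extractLinks($site[$path]) as $link) {
                $queue[] = [$link, $depth + 1];
            }
        }
    }
    return array_keys($visited);
}

$site = [
    '/'                => '<a href="/games/">Games</a> <a href="/about.html">About</a>',
    '/games/'          => '<a href="/games/tips.html">Tips</a> <a href="/">Home</a>',
    '/games/tips.html' => '<p>No links here.</p>',
    '/about.html'      => '<a href="/">Home</a>',
];
print_r(crawl($site, '/', 1));
```

Each path returned by `crawl` is a page that an indexer would then split into words. Raising the depth argument reaches pages that are more clicks away from the start location.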
astuces_jeux
21 Nov 2006 at 09:24
Yes, OK. I install it, then in _connect.php or something like that (in the includes directory) I enter my database details, then I upload everything by FTP,
and I go to http://monsite/phpdig/admin/install.php and it gives me:
Fatal error: Call to undefined function: mb_eregi() in /phpdig/includes/config.php on line 109
The error is on line 109, which is:
if ((isset($_SERVER['SCRIPT_FILENAME'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_FILENAME']))) {
Here is the code of config.php:
<?php
/*
----------------------------------------------------------------------------------
PhpDig Version 1.8.x - See the config file for the full version number.
This program is provided WITHOUT warranty under the GNU/GPL license.
See the LICENSE file for more information about the GNU/GPL license.
Contributors are listed in the CREDITS and CHANGELOG files in this package.
Developer from inception to and including PhpDig v.1.6.2: Antoine Bajolet
Developer from PhpDig v.1.6.3 to and including current version: Charter
Copyright (C) 2001 - 2003, Antoine Bajolet,
Contributors hold Copyright (C) to their code submissions.
Do NOT edit or remove this copyright or licence information upon redistribution.
If you modify code and redistribute, you may ADD your copyright to this notice.
----------------------------------------------------------------------------------
*/
/***********************************************************************************************************************/
//--------PHPDIG VERSION
define('PHPDIG_VERSION','1.8.9 RC1'); // no need to change
/***********************************************************************************************************************/
//--------LANGUAGE AND ENCODING
$phpdig_language = "en"; // language: ca, cs, da, de, en, es, fr, gr, it, nl, no, pt, ru
define('PHPDIG_ENCODING','utf-8'); // KEEP AS utf-8 !!!
/***********************************************************************************************************************/
//----------DETECT ORDER FOR PHP AUTO DETECT ENCODING
// you may have to change this constant depending on the page encoding, for instance...
// define('DETECT_ORDER','UTF-8,ISO-8859-7,ASCII'); // or
// define('DETECT_ORDER','UTF-8,Windows-1251,ASCII'); // or
// define('DETECT_ORDER','UTF-8,BIG-5,ASCII'); // or
// define('DETECT_ORDER','UTF-8,JIS,KOI8-R,EUC-KR,EUC-JP,SJIS,BIG-5'); // etcetera
// the first non UTF-8 encoding in the constant that 'matches' the page is used in conversion to UTF-8.
// note that some pages can match multiple encodings even though only one encoding displays correctly.
// for example, if you do a search and see chinese characters in german text, the order is not correct.
// you may need to set/reset this constant, as there is no perfect ordering for all pages.
// furthermore, some encodings have multiple names (e.g., CP1251 like Windows-1251).
// if needed, edit the function phpdigMakeUTF8 in robot_functions.php to account for multiple names.
define('DETECT_ORDER','UTF-8,KOI8-R,JIS,SJIS,CP936,BIG-5,EUC-CN,EUC-TW,EUC-KR,EUC-JP');
/***********************************************************************************************************************/
//---------CONVERT JAPANESE KANA (only for Japanese)
// note: if you want a different path, you need to add that path (relative path up to the
// admin directory: ../dir or full path up to the admin directory: /full/path/to/dir) in
// the first if statement in this config.php file - for example:
// && ($relative_script_path != "../dir") // relative path
// && ($relative_script_path != "/full/path/to/dir") // full path
// you may also need to set $relative_script_path to this path in search.php, clickstats.php,
// and function_phpdig_form.php depending on what files you are calling from where.
// note: double dot means go back one and single dot means stay in same directory
// note: the path should be UP TO but NOT INCLUDING the admin directory - NO ending slash
// set $relative_script_path = './phpdig'; in search.php, clickstats.php, and function_phpdig_form.php
// add ($relative_script_path != "./phpdig") && to if statement
// full path up to but not including the phpdig admin directory, no end slash
define('ABSOLUTE_SCRIPT_PATH','/full/path/to/dir');
/***********************************************************************************************************************/
//---------SECURITY CHECK (first if statement in config file)
// this chunk of code NEEDS to be here for security - checks to see that $relative_script_path is set to a valid value
if ((!isset($relative_script_path)) || (($relative_script_path != ".") &&
($relative_script_path != "..") && ($relative_script_path != ABSOLUTE_SCRIPT_PATH))) {
// echo "\n\nPath $relative_script_path not recognized!\n\n";
exit();
}
/***********************************************************************************************************************/
//---------DENY DIRECT ACCESS TO CONFIG FILE
// note: if you receive an "undefined index" message that means that your server is not recognizing one or
// some of the $_SERVER variables so check your PHP info and set the $_SERVER variables to those recognized
// by your server: see http://www.php.net/reserved.variables for a list. there are also $_SERVER variables
// in the custom_rss.php and custon_search.php files to prevent direct access to those files too. you could
// use "if (realpath(__FILE__) == realpath($_SERVER['SCRIPT_FILENAME'])) { exit(); }" instead, assuming that
// $_SERVER['SCRIPT_FILENAME'] is defined on your server.
if ((isset($_SERVER['SCRIPT_FILENAME'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_FILENAME']))) {
exit();
}
if ((isset($_SERVER['SCRIPT_URI'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_URI']))) {
exit();
}
if ((isset($_SERVER['SCRIPT_URL'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_URL']))) {
exit();
}
if ((isset($_SERVER['REQUEST_URI'])) && (mb_eregi("config.php",$_SERVER['REQUEST_URI']))) {
exit();
}
if ((isset($_SERVER['SCRIPT_NAME'])) && (mb_eregi("config.php",$_SERVER['SCRIPT_NAME']))) {
exit();
}
if ((isset($_SERVER['PATH_TRANSLATED'])) && (mb_eregi("config.php",$_SERVER['PATH_TRANSLATED']))) {
exit();
}
if ((isset($_SERVER['PHP_SELF'])) && (mb_eregi("config.php",$_SERVER['PHP_SELF']))) {
exit();
}
// this chunk of code NEEDS to be here for security - checks to see that $template is set to a valid value
if (isset($_REQUEST['template_demo'])) {
$template_demo = $_REQUEST['template_demo'];
}
$templates_array = array('black.html','bluegrey.html','corporate.html','green.html','grey.html','lightgreen.html','linear.html','newspaper.html','phpdig.html','simple.html','terminal.html','yellow.html','gaagle.html');
if(isset($template_demo) && in_array($template_demo, $templates_array)) {
$template = "$relative_script_path/templates/$template_demo";
} else {
$template = "$relative_script_path/templates/phpdig.html";
}
// alternatively force the $template value to a valid value
// $template = "$relative_script_path/templates/phpdig.html";
// if using array, set $template = "array";
// if using classic, set $template = "classic";
// now set $template_demo to a clean $template filename or empty string
if (($template != "array") && ($template != "classic")) {
$template_demo = mb_substr($template,mb_strrpos($template,"/")+1); // get filename.ext from $template variable
} else {
$template_demo = "";
}
define('SEARCH_PAGE','search.php'); // the name of the search page
define('SEARCH_DEFAULT_LIMIT',10); // search results per page
define('LINK_TARGET','_blank'); // target for result links
define('HIGHLIGHT_BACKGROUND','#FFBB00'); // highlighting background color, only for classic mode
define('HIGHLIGHT_COLOR','#000000'); // highlighting text color, only for classic mode
define('DISPLAY_DROPDOWN',true); // display dropdown on search page
define('DROPDOWN_URLS',true); // show URLs in dropdown: DISPLAY_DROPDOWN needs to be true
define('DISPLAY_SNIPPETS',true); // display text snippets
define('DISPLAY_SNIPPETS_NUM',4); // max snippets to display
define('DISPLAY_SUMMARY',false); // display description
define('SNIPPET_DISPLAY_LENGTH',200); // max chars displayed in each snippet
define('SUMMARY_DISPLAY_LENGTH',150); // max chars displayed in summary
define('TITLE_DISPLAY_LENGTH',100); // max chars displayed in title
define('PHPDIG_DATE_FORMAT','\1-\2-\3'); // date format for last update
// \1 is year, \2 month and \3 day
// if using rss, use date format \1-\2-\3
define('SEARCH_DEFAULT_MODE','start'); // default search mode (start|exact|any)
// start is AND OPERATOR, exact is EXACT PHRASE, and any is OR OPERATOR
// in language pack make the appropriate changes to 'w_begin', 'w_whole', and 'w_part'
// e.g., 'w_begin' => 'and operator', 'w_whole' => 'exact phrase', 'w_part' => 'or operator'
define('PHPDIG_LOGS',true); // write logs from searches for statistics
define('LOG_CLICKS',true); // log clicks from searches for statistics
define('NUMBER_OF_RESULTS_PER_SITE',-1); // max number of search results per site
// use -1 to display all search results
define('LIST_ENABLE',true); // activates/deactivates listing of past queries
define('LIST_PAGE','list.php'); // the name of the list page
define('LIST_NEW_WINDOW',1); // open queries in new window
define('LIST_SHOW_ZEROS',0); // show queries with zero results
define('LIST_DEFAULT_LIMIT',20); // listings per page - positive integer of ten - 10,20,30,...
define('LIST_META_TAG','<meta name="robots" content="noindex,nofollow">'); // meta tag for list page
define('TEXT_STORAGE_AMOUNT',10000); // max characters per page to store in files/tables
define('TEXT_CONTENT_PATH','text_content/'); // path to text content files directory for indexed page content
define('CONTENT_TEXT',0); // activates/deactivates the storage of text content in files
define('SPIDER_MAX_LIMIT',20); // max (re)index search depth - used for shell and admin panel dropdown
define('RESPIDER_LIMIT',5); // max update search depth - only used for browser, not used for shell
define('LINKS_MAX_LIMIT',20); // max (re)index links per - used for shell and admin panel dropdown
define('RELINKS_LIMIT',5); // max update links per - only used for browser, not used for shell
define('LIMIT_TO_DIRECTORY',false); // limit index to given (sub)directory where (sub)directories of given (sub)directory are NOT indexed
// for limit to directory, URL format must either have file at end or ending slash at end
// e.g., http://www.domain.com/dirs/ (WITH ending slash) or http://www.domain.com/dirs/dirs/index.php
define('ALLOW_SUBDIRECTORIES',false); // limit index to given (sub)directory where (sub)directories of given (sub)directory are indexed
// if set to true, LIMIT_TO_DIRECTORY must also be set to true
define('LIMIT_DAYS',0); // default days before reindexing a page via admin panel or shell is allowed
// this does not automatically reindex - to auto reindex, you need to run a cron job
define('SMALL_WORDS_SIZE',2); // min size of word to not index - must be two or more
define('MAX_WORDS_SIZE',300); // max size of word to not index - words separated by spaces
define('PHPDIG_EXCLUDE_COMMENT','<!-- phpdigExclude -->'); // comment to exclude part of a page
define('PHPDIG_INCLUDE_COMMENT','<!-- phpdigInclude -->'); // comment to include part of a page
// comments must be on their own lines in the HTML source
// text within comments is not indexed
// links within comments are indexed
define('APPEND_TITLE_META',false); // append title and meta information to indexed results
define('TITLE_WEIGHT',3); // relative title weight: APPEND_TITLE_META needs to be true
define('PHPDIG_SESSID_REMOVE',true); // remove SIDs or variables from links being indexed
define('PHPDIG_SESSID_VAR','PHPSESSID,s'); // name of SID or variable to remove - cAsE sEnSiTiVe
// can be 's' or comma delimited 's,id,var,foo,etc'
define('PHPDIG_IN_DOMAIN',true); // jump hosts in the same domain
// e.g., if the host is www.domain.com, the domain is domain.com
define('SILENCE_404S',true); // silence 404 output when indexing
define('TEMP_FILENAME_LENGTH',8); // filename length of temp files that are created when indexing
// if using external tools with extension, use 4 for a filename of length 8
define("END_OF_LINE_MARKER","\r\n"); // end of line marker - keep double quotes
define('CHUNK_SIZE',1024); // pages are divided into chunks for processing
// chunk size for regex processing
define('USE_IS_EXECUTABLE_COMMAND','1'); // use PHP is_executable for external binaries
// if set to true, is_executable used
// set to '0' if is_executable is undefined
// note: chances are that you do not need to set any options, as phpdig should use the 'DETECT_ORDER' constant for encoding
// only set an extension if the external binary output is not STDOUT and a different extension is produced by the external binary
// e.g., use '.txt' (including the period) if the external binary writes output to filename.txt instead of piping output to STDOUT
define('PHPDIG_INDEX_MSWORD',false); // activate/deactivate
define('PHPDIG_PARSE_MSWORD','/usr/local/bin/catdoc'); // full path to external binary
define('PHPDIG_OPTION_MSWORD',''); // external binary options, e.g., '-s utf-8'
define('PHPDIG_MSWORD_EXTENSION',''); // only set if NOT STDOUT
define('PHPDIG_INDEX_PDF',false); // activate/deactivate
define('PHPDIG_PARSE_PDF','/usr/local/bin/pdftotext'); // full path to external binary
define('PHPDIG_OPTION_PDF',''); // external binary options, e.g., '-enc UTF-8'
define('PHPDIG_PDF_EXTENSION','.txt'); // only set if NOT STDOUT
define('PHPDIG_INDEX_MSEXCEL',false); // activate/deactivate
define('PHPDIG_PARSE_MSEXCEL','/usr/local/bin/xls2csv'); // full path to external binary
define('PHPDIG_OPTION_MSEXCEL',''); // external binary options, e.g., '-s utf-8'
define('PHPDIG_MSEXCEL_EXTENSION',''); // only set if NOT STDOUT
define('PHPDIG_INDEX_MSPOWERPOINT',false); // activate/deactivate
define('PHPDIG_PARSE_MSPOWERPOINT','/usr/local/bin/ppt2text'); // full path to external binary
define('PHPDIG_OPTION_MSPOWERPOINT',''); // external binary options, e.g., 'whatever'
define('PHPDIG_MSPOWERPOINT_EXTENSION',''); // only set if NOT STDOUT
// note: make sure ABSOLUTE_SCRIPT_PATH is the full path up to but not including the admin dir, no ending slash
// note: CRON_ENABLE set to true writes a file at CRON_CONFIG_FILE containing the cron job information
// the CRON_CONFIG_FILE must be 777 permissions if applicable to your OS/setup.
// you still need to call the CRON_CONFIG_FILE to run the cron job !!!
// from shell: crontab CRON_CONFIG_FILE to set the cron job: replace CRON_CONFIG_FILE with actual file
// from shell: crontab -l to list and crontab -d to delete
define('CRON_ENABLE',false); // activates/deactivates creation of cron file
define('CRON_EXEC_FILE','/usr/bin/crontab'); // full path to crontab
define('CRON_CONFIG_FILE',ABSOLUTE_SCRIPT_PATH.'/admin/temp/cronfile.txt'); // where to write cron file
define('PHPEXEC','/usr/local/bin/php'); // full path to PHP
define('FTP_ENABLE',0); // activate/deactivate ftp for distant indexing
define('FTP_HOST','<ftp host>'); // if distant indexing, set the ftp host
define('FTP_PORT',21); // if distant indexing, set the ftp port
define('FTP_PASV',1); // activates/deactivates passive mode
define('FTP_PATH',''); // distant path from the ftp root
define('FTP_TEXT_PATH','text_content'); // ftp path to the text content directory
define('FTP_USER','<ftp usename>'); // ftp username
define('FTP_PASS','<ftp password>'); // ftp password
define('ALLOW_RSS_FEED',false); // activate/deactivate feed - if true, set rss directory to 777 permissions if applicable
$theenc = PHPDIG_ENCODING; // needs to be same encoding used in index - do not change !!!
$theurl = "http://www.phpdig.net/"; // site offering the rss feed
$thetitle = "PhpDig.net"; // title for site offering the rss feed
$thedesc = "PhpDig :: Web Spider and Search Engine"; // description of site offering the rss feed
$thedir = "./rss"; // the rss directory name, no ending slash
$thefile = "search.rss"; // value used in rss filenames
// regexp for forbidden links - some links may return text/html mime-type but should not be indexed so forbid them !!!
// you can also expand the FORBIDDEN value by writing a regex to forbid certain links containing 'word' from being indexed
define('FORBIDDEN','\.(js|inc|rm|ico|cab|swf|css|gz|z|tar|zip|tgz|msi|arj|zoo|rar|r[0-9]+|exe|bin|pkg|rpm|deb|bz2)$');
/***********************************************************************************************************************/
//----------CHARACTER CLASS CONTAINING CHARACTERS ALLOWED IN LINKS
// character class MUST have "[ characters go in here ]*" format
// do NOT enter [ or ] in character class - backslash other special characters
// see http://www.php.net/manual/en/reference.pcre.pattern.syntax.php for further info
// $allowed_link_chars = "[:%/?=&;\\,._a-zA-Z0-9|+ ()~-]*"; // includes space and () but not good with javascript
$allowed_link_chars = "[:%/?=&;\\,._a-zA-Z0-9|+~-]*";
/***********************************************************************************************************************/
//----------APACHE INDEX PAGES
/***********************************************************************************************************************/
//----------NOTHING TO CHANGE BELOW THIS LINE
// check to make sure a language is set
if (!isset($phpdig_language)) {
$phpdig_language = "en";
}
// include a language file
define('PHPDIG_LANG_CONSTANT',$phpdig_language); // this line is needed for classic
if (is_file("$relative_script_path/locales/$phpdig_language-language.php")) {
include "$relative_script_path/locales/$phpdig_language-language.php";
}
elseif (is_file("$relative_script_path/locales/en-language.php")) {
include "$relative_script_path/locales/en-language.php";
}
else {
die("Unable to select language pack.\n");
}
// connect to database
if ((!isset($no_connect)) || ($no_connect != 1)) {
if (is_file("$relative_script_path/includes/connect.php")) {
include "$relative_script_path/includes/connect.php";
}
else {
die("Unable to find connect.php file.\n");
}
}
// include the libraries
if (is_file("$relative_script_path/libs/phpdig_functions.php")) {
include "$relative_script_path/libs/phpdig_functions.php";
}
else {
die ("Unable to find phpdig_functions.php file.\n");
}
if (is_file("$relative_script_path/libs/function_phpdig_form.php")) {
include "$relative_script_path/libs/function_phpdig_form.php";
}
else {
die ("Unable to find function_phpdig_form.php file.\n");
}
if (is_file("$relative_script_path/libs/mysql_functions.php")) {
include "$relative_script_path/libs/mysql_functions.php";
}
else {
die ("Unable to find mysql_functions.php file.\n");
}
// check the template value
if ((!isset($template)) || ((!is_file($template)) && ($template != "array") && ($template != "classic"))) {
die ("Unable to render template file.\n");
}
// send encoding if needed
if (!headers_sent()) {
header('Content-type:text/html; Charset='.PHPDIG_ENCODING);
}
// turn off magic_quotes_runtime for escaping purposes
@ini_set('magic_quotes_runtime',false);
// turn off magic_quotes_sybase for escaping purposes
@ini_set('magic_quotes_sybase',false);
// check that the tables exist
if ((!isset($no_connect)) || ($no_connect != 1)) {
phpdigCheckTables($id_connect,array('engine',
'excludes',
'keywords',
'sites',
'spider',
'tempspider',
'logs',
'clicks',
'site_page',
'includes'));
}
?>
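The fatal error quoted earlier says that PHP does not know `mb_eregi()`. That function belongs to PHP's mbstring extension, so the likely cause is that mbstring is not enabled on the server. The config file also calls `mb_substr()` and `mb_strrpos()`, which come from the same extension. Below is a small diagnostic sketch, saved under a hypothetical name such as `check_mb.php`, that reports which of these functions are missing. The helper name `missingFunctions` is an assumption, and the code deliberately avoids newer syntax so it should also run on old PHP versions.

```php
<?php
// Diagnostic sketch: report which multibyte functions used by config.php are missing.

// Return the names in $required that PHP does not define as functions.
function missingFunctions($required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!function_exists($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Functions that the config.php shown above calls.
$needed = array('mb_eregi', 'mb_substr', 'mb_strrpos');
$missing = missingFunctions($needed);

if (!extension_loaded('mbstring')) {
    echo "The mbstring extension is not loaded.\n";
}
if (count($missing) > 0) {
    echo 'Missing functions: ' . implode(', ', $missing) . "\n";
} else {
    echo "All required multibyte functions are available.\n";
}
```

If functions are reported missing, the usual remedy is to have mbstring (with its regex support) enabled in the server's PHP build or configuration. On shared hosting that typically means asking the host.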