HTML filtering

Posted: 09 Oct 2009, 14:27
by Nero666
For some unknown reason, many of the 404s caught by Phoca SEF included HTML tags and attributes, and sometimes even JavaScript code, in their URLs, though all of this HTML looked like it came from Joomla in some way.
The biggest problem was that this HTML was output in the 404 listing (index.php?option=com_phocasef&view=phocasefurls), so that page stopped working. After I manually removed the unwanted entries from my database, Phoca SEF worked again.

I would suggest adding a filter for HTML and JavaScript to the 404-catching code, so that this can't happen again.

An example:
templates/ja_purity/styleForums</span></a><ul><li class=
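
A minimal sketch of the kind of filter suggested above, written in Python for illustration only (Phoca SEF itself is a PHP extension, and this is not its code). The function names `looks_like_markup` and `escape_for_listing` and the choice of suspicious characters are assumptions. The idea is to reject a captured URL that contains markup before it is stored, and in any case to escape stored values before printing them in a listing.

```python
import html
import re

# Assumption: angle brackets, double quotes, or a "javascript:" scheme
# do not belong in a legitimate request path.
_SUSPICIOUS = re.compile(r'[<>"]|javascript:', re.IGNORECASE)


def looks_like_markup(url: str) -> bool:
    """Return True if a captured 404 URL appears to contain HTML or script."""
    return bool(_SUSPICIOUS.search(url))


def escape_for_listing(url: str) -> str:
    """Escape a stored URL so it is shown as text, not rendered as HTML."""
    return html.escape(url)


if __name__ == "__main__":
    sample = 'templates/ja_purity/styleForums</span></a><ul><li class='
    print(looks_like_markup(sample))
    print(escape_for_listing(sample))
```

Escaping on output is the safer half of the two: even if an odd URL slips past the filter, it can no longer break the listing page.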

Re: HTML filtering

Posted: 11 Oct 2009, 21:55
by Jan
Hi, sorry, I don't understand what you mean or what the problem is. Phoca SEF only redirects obsolete sites.

Re: HTML filtering

Posted: 10 Nov 2010, 22:14
by pitu
Hi,
Same problem here.

I don't know how these URLs originate (user or administrator URL mistakes, maybe a Joomla bug, or a hacking attempt), but they come from Joomla (it is not a Phoca SEF problem).

Examples from database:

INSERT INTO `jos_phocasef_url` (`id`, `cw`, `cr`, `new_url`, `old_url`, `date_url`, `published`, `ordering`) VALUES
(251, 2, 0, '', 'templates/sy     </td>    </tr></table></div><div class=', '2010-11-01 13:46:31', 0, 0),
(253, 1, 0, '', '<div id="logo"><!-- <a href=', '2010-11-01 13:55:34', 0, 0),
(301, 2, 0, '', 'modules/coppermine/themes/coppercop/theme.php?THEME_DIR=|echo "casper";echo "kae";|', '2010-11-05 02:54:04', 0, 0),
(304, 1, 0, '', 'template�cii.</p><p align=', '2010-11-05 09:06:19', 0, 0),
(333, 1, 0, '', 'templates/system/css/system> Aktuality: </div><table class=', '2010-11-09 10:38:29', 0, 0),
(335, 1, 0, '', 'tem. 5. 2010        </td>    </tr></table></div><div class=', '2010-11-09 10:40:28', 0, 0),
(336, 1, 0, '', 'templates/sys: left;', '2010-11-09 10:40:34', 0, 0),
(338, 1, 0, '', 'templates/sunshine_1/js/debugp><strong>Obrancovia:</strong> Roman Mas<strong>Záložníci:</strong', '2010-11-09 15:31:50', 0, 0);
And some examples from the Phoca SEF back end (caused by the bad URLs):

Image

Image

I think the problem is that Phoca SEF does not ignore them.
Yes, these URLs can be deleted from the database, but maybe Nero666's idea is a good one: ignore (filter) them and do not store them in the Phoca SEF tables (or store them as "not valid URLs" in another table and show them in a different way in the back end?)
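
The two options in the last paragraph (store normally, or set aside as "not valid") could be sketched as below. This is an illustration in Python, not Phoca SEF code. The function name `triage`, the labels "store" and "quarantine", and the allowlist of characters are all assumptions. It uses an allowlist rather than a list of forbidden characters, so fragments such as `templates/sys: left;` are also set aside even though they contain no angle brackets.

```python
import re

# Assumed allowlist: letters, digits, and punctuation commonly seen in
# request paths and query strings. Anything else (spaces, <, >, ", |, ;)
# marks the URL as not valid.
_ALLOWED = re.compile(r"[A-Za-z0-9\-._~/?&=%+:,]*")


def triage(old_url: str) -> str:
    """Return "store" for a plausible URL, "quarantine" for anything else."""
    return "store" if _ALLOWED.fullmatch(old_url) else "quarantine"


if __name__ == "__main__":
    # Sample values taken from the database rows quoted in this thread,
    # plus one ordinary-looking URL for comparison.
    samples = [
        '<div id="logo"><!-- <a href=',
        'templates/sys: left;',
        'modules/coppermine/themes/coppercop/theme.php?THEME_DIR=|echo "casper";echo "kae";|',
        'old/page.html?id=3',
    ]
    for url in samples:
        print(triage(url), url)
```

A "quarantine" result could mean either discarding the URL or writing it to a separate table; which is better depends on whether the administrator wants to see the attempts at all.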

Re: HTML filtering

Posted: 12 Nov 2010, 22:22
by Jan
Hi, I will take a look at it. If I find a problem there, I will fix it and release the fix in a new version.

Jan