May 14th, 2009
Web developers working with Indian devanagari language may face some issues while pulling out strings from the MySQL database. If not done correctly, one will get a series of question marks instead of the devanagari.
Solution:
Set the ‘collation’ of the database, table and column to ‘utf8_unicode_ci‘. Now, whenever you want to retrieve the unicode string, simply run the below given query before running the actual query to retrieve the unicode string.
mysql_query("SET NAMES 'utf8'");//run this query first
$resource=mysql_query('SELECT book_title FROM hindi_marathi_books'); |
Thats it! Now use PHP to parse the strings as you like!
1 Comment |
PHP-MySQL |
Permalink
Posted by Rohan Shenoy
February 9th, 2009
Validating email addresses is one of the common uses of regular expressions. However, not all email addresses are in the typical format of “someone@somewhere.com”. With the frequent advent of new domain extensions, regular expressions may fail to validate certain offbeat email address patterns.
If you are using PHP 5.2 or higher, you can use an inbuilt function for that:
filter_var('bob@example.com', FILTER_VALIDATE_EMAIL);
//returns true |
Apart from email address validation, some more filters can be found. A list of 17 available filters can found on PHP’s offical manual page
Update: Mmerlinn–a member at Digitl Point forums has alerted me that FILTER_VALIDATE_EMAIL is not RFC2822 compliant, and is vulnerable to new-line injection.
No Comments » |
PHP-MySQL |
Permalink
Posted by Rohan Shenoy
July 12th, 2008
Regular expressions by themselves are difficult for many to implement. An error that many of the regex newbies will encounter is:
Warning: preg_match() [function.preg-match]: Unknown modifier ‘/’ in ….. on line …
Solution
This error occurs whenever the pattern you are looking for contains a literal forward-slash. You should escape every literal forward-slash with a backslash.
For those of you interested in knowing more
The syntax for preg_match() is
int preg_match ( string $pattern , string $subject [, array &$matches [, int $flags [, int $offset ]]] ) |
Example of a pattern to search for alphabetic substrings in a string would be:
In the above, pattern example, the two forward slashes you see are the delimiters. The PHP parser is programmed to understand that anything in-between those two delimiters is the actual pattern we are looking for. However, the pattern we just used above will look only for lowercase alphabets. If we want PHP to look for even upper case or mixed case substrings, we need to inform the PHP parser accordingly.
All such additional instructions are provided to the PHP parser using entities which are called as ‘modifiers’. These modifiers are placed at the end of the pattern, after the delimiting forward-slash. Look below to see how we place the modifier named ‘i’ to instruct PHP parser to do a case-insensitive pattern matching
Any character appearing after the end delimiting forward slash is ‘assumed’ by the PHP parser as a delimited. A number of delimiters are defined by PHP. A full list of modifiers can be found here.
If you forget to escape a literal forward-slash in your pattern parameter, PHP assumes that the pattern has been ended by the delimiter and characters after that forward-slash are ‘modifiers’. When the parser finds that the modifiers are not defined by PHP, its outputs the error saying ‘Unknown modifier’.
No Comments » |
PHP-MySQL |
Permalink
Posted by Rohan Shenoy
June 7th, 2008
PHP-MySQL is great server side duo. But when designing an application, utmost care needs to be observed. With time, most web programmers learn about common things such as cannot send headers, mysql_query() is not a valid resource, unexpected T_IF VARIABLE, etc. However, some mistakes are less known and less publicized which leads to repetition of these mistakes.
- Trusting a cookie: While most programmers will validate GET and POST data, they will often forget to validate COOKIE data. It is easy to forget cookie data because cookies are not so interactive or conspicuous as a form that is supposed to collect GET or POST data. Always use the following 2 functions when the data is supposed to reside in the database: mysql_real_escape_string() and htmlspecialchars().
- File uploads: Uploaded files should never be stored with their original name and extension. The file name can be a randomly generated string that the user is not aware of. Store the files with an extension which the server is not aware of. Eg: A typical good file manager script will store the uploader file with a name such as hn32hyy3h4oiuc828y3h21ioe409u.ATTACHMENT. Don’t generate this random file name string using the md5() or sha1() or mh5_file() or sha1_file() functions. While serving a download, don’t expose the upload folder on your website. Instead use a download script which commonly takes its inputs like: download.php?file=11e561vc5sdf
- Error reporting: While error reporting is the only rapid solution to get a clue about errors, it should be switched off on the production server. Obviously, it is needed on the development server though. I have come across multiple websites that generate lots of error and allow the public to read those error lines. You may invite full path disclosure exploits. Use error_reporting(0) to prevent exhibition of errors. The question now is ‘If errors are not displayed how would I know where the error is?’. You can check out your error_log file to get an answer for that question. Remember to check the error_log file in the same directory as the script because every directory has its own error_log file.
- Timing out of scripts: While it is expected that scripts will do their function within a stipulated amount of time, its not always true. The script can take longer than expected time and may even exceed the tile limit and thus not generating any useful output. Mailing scripts, search scripts and other mass action scripts are more prone to such weaknesses. It can be overcome by time limits over execution using the set_time_limit() function.
- Editing php.ini file: Whenever possible, avoid editing php.ini file. Not all programmers really look deep into it to check for non-default values and assume default values. So if you change programmers in between a project or the new programmer is unaware of the variations in php.ini file, your application may develop ’security holes’. If you need to change the settings for a particular script or few of them, always prefer using PHP options and information functions.
2 Comments |
PHP-MySQL |
Permalink
Posted by Rohan Shenoy
June 6th, 2008
When you DELETE row(s) form a MySQL table, it creates an unoccupied empty space in the table. MySQL database remembers this extra space and whenever there is an INSERT, instead of creating new space for the newly inserted data, it uses that unoccupied empty space for the new insert. An empty space is a like non-profit asset for the MySQL database since it has to remember it even though it does not contain any data.
If you want it simpler, then this is how I explain: Assume you have 5 houses which you have rented out to your tenants. If one of the tenants vacates the house, you will get no income from it while you will have to still spend money to maintain it. This extra expenditure on a non-performing asset is termed as overhead.
The simplest way to eliminate this overhead is to fill new data into the empty space. However, if the overload is high, you should consider eliminating the empty space. This can be done by a query to optimize the table.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
| <?php
mysql_connect($host,$username,$password);
mysql_select_db($db);
$table_name="Your_table_name_here"
$sql="OPTIMIZE TABLE '$table_name'";
if(mysql_query($sql))
{
echo "The $table_name has been optimized";
}
else
{
echo "Failed to optimize $table_name ";
}
?> |
Alternatively, you can do the same optimization using PhpMyAdmin. If over head is present, it is displayed in the ‘Structure’ tab with a ‘optimize table’ link adjacent to it.
No Comments » |
PHP-MySQL |
Permalink
Posted by Rohan Shenoy