PDA

View Full Version : [OT] mySQL Problem



aKe`cj
08-03-08, 01:13
Since I know there's a few quite skilled programmers who frequent this board, maybe someone can help me with a rather annoying issue:

The tables involved for the query are



dict_de dict_en de_en
+-------+--------+ +-------+--------+ +----------------+
| id | word | | id | word | | de | en |
+-------+--------+ +-------+--------+ +----------------+
| 8 | Jacke | | 15 | jacket | | 8 | 15 |
+-------+--------+ +-------+--------+ +----------------+


The first query (below) will return one result, as expected.
The second however, returns ZERO. Both queries are identical apart from the search term.


SELECT g.word AS gword, e.word AS eword, e.id AS eid, g.id AS gid, i.rating
FROM dict_en AS e
JOIN de_en AS i ON ( i.en = e.id )
JOIN dict_de AS g ON i.de = g.id
WHERE (
g.word REGEXP '^(Jacke)$'
AND g.id = i.de
)
OR (
e.word REGEXP '^(Jacke)$'
AND e.id = i.en
)



SELECT g.word AS gword, e.word AS eword, e.id AS eid, g.id AS gid, i.rating
FROM dict_en AS e
JOIN de_en AS i ON ( i.en = e.id )
JOIN dict_de AS g ON i.de = g.id
WHERE (
g.word REGEXP '^(jacket)$'
AND g.id = i.de
)
OR (
e.word REGEXP '^(jacket)$'
AND e.id = i.en
)


... to make things even more confusing, REGEXP 'jacket' or LIKE 'jacket' will get the desired result with the second query. :confused:


Many thanks in advance for any useful hints!!!

DER_julu
08-03-08, 02:01
what versions of php and mySQL are running serverwise ?
edit://since my connect is super-unstable at the moment, i'll look at the problem tomorrow morning... can't even connect to the game because my ISP fucked up my line completely...

aKe`cj
08-03-08, 02:14
what versions of php and mySQL are running serverwise ?

PHP 5.1.4
mySQL 5.0.22

Zefrian
08-03-08, 03:58
ok, i created:



CREATE TABLE `de_en` (
`de` int(11) NOT NULL default '0',
`en` int(11) NOT NULL default '0'
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `de_en` VALUES (8, 15);
INSERT INTO `de_en` VALUES (7, 27);

CREATE TABLE `dict_de` (
`id` int(11) NOT NULL default '0',
`word` char(30) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `dict_de` VALUES (8, 'Jacke');
INSERT INTO `dict_de` VALUES (7, 'Himmel');

CREATE TABLE `dict_en` (
`id` int(11) NOT NULL default '0',
`word` char(30) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT INTO `dict_en` VALUES (15, 'jacket');
INSERT INTO `dict_en` VALUES (27, 'sky');


and executed your SQL Statements on an mysql 4.1.22 server with only ONE modification: i removed the undefined "i.rating"

Both querys gave the expected results.

:confused:

Maybe i misinterpreted something, because it is very late in the night here :) ... or theres some details missing for replicating your problem ... or your mysql version has got a bug :D

naimex
08-03-08, 09:59
Why dont you just use 1 language table, and add an extra field per language?


id english german spanish russian japanese italian whatever
x thanks danke gracias dazvedania arregato ? ?

Mighty Max
08-03-08, 11:06
In languages you have often multiple words mapped to one translation or even multiple words mapped to multiple translations. This would make one combined table very huge and slow.

Where clauses on runtime-joined tables will only be evaluated if they are possible to fire up. On such prejoined tables you'll evaluate the where on every line and your high performance database will be performing like a Trabbi (http://tbn0.google.com/images?q=tbn:26cs7ECPshQE0M:http://home.kpnqwest.cz/vla/Trabi/TrabiFotos/trabbi-jan.gif) ;)

naimex
08-03-08, 11:56
In languages you have often multiple words mapped to one translation or even multiple words mapped to multiple translations. This would make one combined table very huge and slow.

Where clauses on runtime-joined tables will only be evaluated if they are possible to fire up. On such prejoined tables you'll evaluate the where on every line and your high performance database will be performing like a Trabbi (http://tbn0.google.com/images?q=tbn:26cs7ECPshQE0M:http://home.kpnqwest.cz/vla/Trabi/TrabiFotos/trabbi-jan.gif) ;)


Yeah okay, but it all really depends how extensive your tables are going to get.


At the moment the most problems I'm running into, are tables with 100 000 -> 250000 rows in total, divided over 9 tables, that all relate to eachother, it does start slowing down everytime I have to pull out more than 9000 entries at the same time.

aKe`cj
08-03-08, 13:50
thanks for your responses and thank you for checking out the query first-hand zefrian. as so often, the real problem was not within the code, but that idiot in front of the screen.

The issue behind this was connected to the way I import the wordlists, leading to a left over CRLF at the end of each line (germanword#englishwordCRLF).
That probably explains while german serch terms worked fine, yet english did not. A trim() on the import worked wonders :o



Why dont you just use 1 language table, and add an extra field per language?

id english german spanish russian japanese italian whatever

MightyMax already pointed out the crucial issues with this.
There are multiple matches for "jacket" in german, which again have multiple matches in english, yet not necessarily all the same. As I dont use it for a multilingual website, but for an open dictionary, performance is relevant - as well as being able to do some other funky stuff with the match-tables.