Cleaning up game titles on Details pages

feedback, questions and discussion relating to the Complete BBC Games Archive (beta site now open!)
lurkio
Posts: 1765
Joined: Tue Apr 09, 2013 11:30 pm
Location: Doomawangara

Cleaning up game titles on Details pages

Post by lurkio » Wed Apr 04, 2018 1:19 am

Paul, this is extremely non-urgent, so feel free to ignore it, but it occurred to me that the way that definite and indefinite articles ("The", "An", "A") are tagged onto the end of the game-name is fine in the backend database and in search results, but it looks a bit awkward in the masthead of a game's Details page (e.g. "Island, The").

I think the title can be cleaned up with this regex:
[Attached image: Untitled.png, showing the regex]
That will move the article back to the beginning of the main game-name and will leave everything else alone. (Btw, the reason a left square bracket appears in the regex is that Lee and I have started putting square brackets around words we want to appear in search results and on the first line of the title at the top of a Details page.)

Would it be possible to apply this regex to the game title -- on Details pages only -- before you do any other text-processing of the title (e.g. before you look to see if it needs to be split onto two lines at the first parenthesis (round bracket))?

:?:
Last edited by lurkio on Wed Apr 04, 2018 4:00 pm, edited 2 times in total.

leenew
Posts: 3661
Joined: Wed Jul 04, 2012 3:27 pm
Location: Doncaster, Yorkshire

Re: Cleaning up game titles on Details pages

Post by leenew » Wed Apr 04, 2018 7:45 am

So to use the above example of "Island, The":
This would display as "The Island" but when browsing the site alphabetically, it would appear with the I's and not the T's. Is that correct?
That's what I would want!

Lee.

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Wed Apr 04, 2018 8:33 am

I will look into it. Have you run this against all the titles to make sure it doesn't mess anything up? The regex seems pretty robust at first glance though (It will only match the words you specify).

Do you think this is the best way to solve this problem in the long term? It seems to me that you are denormalizing the data (I know it wasn't totally normalised to start with), which can cause problems later. For example, if I want to search for a pre-release game, do I type pre-release into the search box, or do I click the "unreleased game" button? It seems you are putting metadata in the title field. We have a database for that.

I would strongly consider splitting the article into a separate field, and the other metadata as well, then considering how you want it to be displayed, rather than enforcing rules on the format of the title which someone else might not understand and which are difficult to apply consistently anyway.

I hope this makes sense.
I'm working on http://bbcmicro.co.uk

lurkio
Posts: 1765
Joined: Tue Apr 09, 2013 11:30 pm
Location: Doomawangara

Re: Cleaning up game titles on Details pages

Post by lurkio » Wed Apr 04, 2018 11:28 am

pau1ie wrote:I would strongly consider splitting the article into a separate field, and the other metadata as well, then considering how you want it to be displayed
Yes, that would be the better solution. The reason I didn't suggest it is that I didn't want to make (too much) more work for anyone! But yes, that would be the way to go, in an ideal world.
pau1ie wrote:if I want to search for a pre-release game, do I type pre-release into the search box, or do I click the "unreleased game" button? It seems you are putting metadata in the title field. We have a database for that.
It's worth mentioning that the reason we started putting tags in square brackets just after game-names is that we wanted to disambiguate different versions of a game at a glance in search results: e.g. the search results for Citadel previously didn't differentiate the pre-release from the final release -- but now they do. But yes, I still think a better solution would be to put the tags in a new field in the database.
pau1ie wrote:Have you run this against all the titles to make sure it doesn't mess anything up?
Yes, the regex works on all the relevant titles in the database currently. But I just realised it might fail on something like (made-up example) "Good, The Bad, And The Ugly, The" unless the asterisk operator is "greedy" in PHP -- which I don't know if it is.
leenew wrote:So to use the above example of "Island, The": This would display as "The Island" but when browsing the site alphabetically, it would appear with the I's and not the T's. Is that correct?
Well, what I had in mind was that in search results, or when you're browsing the site alphabetically, the name would still appear as "Island, The" because the regex would only be applied on the Details page for the game. It's only at the top of the Details page that the name would appear as "The Island".

:idea:
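
On the greediness question: in PHP's PCRE functions, quantifiers such as the asterisk are greedy by default (unless the U modifier is set), and Python's `re` module behaves the same way. The regex in the screenshot is not available as text, so the pattern below is a hypothetical stand-in written only to illustrate the idea; `ARTICLE_RE` and `display_title` are made-up names, not anything from the site's code.

```python
import re

# Hypothetical pattern, NOT the one from the attached screenshot.
# Group 1: main game-name. ".*" is greedy, so it runs up to the LAST
#          ", The" / ", An" / ", A" that still lets the rest match.
# Group 2: the trailing article.
# Group 3: optional remainder starting with " [" or " (" (tags, AKAs, ...).
ARTICLE_RE = re.compile(r"^(.*), (The|An|A)( [\[(].*)?$")


def display_title(stored: str) -> str:
    """Move a trailing article to the front; leave other titles untouched."""
    m = ARTICLE_RE.match(stored)
    if m is None:
        return stored
    name, article, rest = m.group(1), m.group(2), m.group(3) or ""
    return f"{article} {name}{rest}"


print(display_title("Island, The"))
print(display_title("Good, The Bad, And The Ugly, The"))
print(display_title("Island, The [Pre-release]"))
```

Under these assumptions the made-up "Good, The Bad, And The Ugly, The" title keeps its inner articles and only the final one moves. Anything after the article that does not start with a bracket (a colon, say) is deliberately left alone.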

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Wed Apr 04, 2018 5:13 pm

Can you have a think about what the ideal world would look like? I don't have time to work on it now, so I will probably end up doing what you originally asked as a stopgap. Long term, though, that is likely to cause more work than it saves, especially if we keep adding quick hacks, so hopefully I will have time to work on it eventually.

It looks like you want to add tags to games. These seem to overlap to an extent with the release types. Should we replace the release types with tags which can be added to more than one game?

The title field contains a lot of information. Which parts should be split out? Here are some examples to think about:
  • Article, as already mentioned (The, A, An), should be split out.
  • The information in brackets also seems to be a good candidate. It seems to fall into the following groups:
  • AKA. I could make some fields which are also searched. I don't think there are more than about 4 AKAs per game.
  • +Editor. Should this be some kind of tag?
  • Parts 1-2. Presumably these games were delivered together. We already have a series and series number, which are both free text. Could we use those?
  • Rick Hanson Trilogy part 3 / Ket Trilogy / Chronicles of Grotty Betty - Probably belongs in series and number?
  • Pop Quiz Master Database Disc n - Not sure about this one. Maybe there should be a "thing" for add-on discs, which the picture discs for adventures could also go into.
  • Complete Home Entertainment Centre - I think this is probably a compilation?
  • W.A.R. (Game 1 Only) - This is probably a series?
  • Christmas Crackers and One line games are compilations. The database isn't really set up for compilations that aren't split out, so I think they should be left as they are. The "correct" thing to do would be to have each on its own SSD, but that seems silly in these cases.
  • Hack / Pre-release - These should be tags. We can think about how they display in the index. Maybe some kind of emoji which has a mouseover to explain what it means?
I envisage tags working like the following. They have a description, short text (maybe a dagger or an asterisk, or I like the idea of a UTF-8 emoji), and a flag for whether they are displayed on the index screen. They are linked with games in a similar manner to secondary genres. Then, should we dispense with release types and just have these tags?
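
A minimal sketch of how that tag idea could be laid out, using an in-memory SQLite database from Python rather than the site's real MySQL schema. Every table and column name here (`tags`, `game_tags`, `short_text`, `show_in_index`) is an assumption made for illustration, as is the reading that each tag carries a description, a short marker and an index-display flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- One row per tag: description, short marker, and the index-display flag.
CREATE TABLE tags (
    id            INTEGER PRIMARY KEY,
    description   TEXT NOT NULL,
    short_text    TEXT NOT NULL,
    show_in_index INTEGER NOT NULL DEFAULT 0
);

-- Link table, so one tag can be attached to many games and vice versa.
CREATE TABLE game_tags (
    game_id INTEGER NOT NULL REFERENCES games(id),
    tag_id  INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (game_id, tag_id)
);

-- Sample data: two Citadel entries, the second tagged as a pre-release.
INSERT INTO games VALUES (1, 'Citadel'), (2, 'Citadel');
INSERT INTO tags VALUES (1, 'Pre-release', '†', 1), (2, 'Hack', '*', 0);
INSERT INTO game_tags VALUES (2, 1);
""")


def index_rows(db):
    """Return (id, title, markers), using only tags flagged for the index."""
    return db.execute("""
        SELECT g.id, g.title, COALESCE(GROUP_CONCAT(t.short_text, ''), '')
        FROM games g
        LEFT JOIN game_tags gt ON gt.game_id = g.id
        LEFT JOIN tags t ON t.id = gt.tag_id AND t.show_in_index = 1
        GROUP BY g.id, g.title
        ORDER BY g.title, g.id
    """).fetchall()


for row in index_rows(conn):
    print(row)
```

The link table mirrors the "similar manner to secondary genres" remark: tags live in one place and are attached to games by id, so the two Citadel entries can be told apart in the index without touching the title.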

Food for thought, and no promises as to how fast I will get round to it.

richardtoohey
Posts: 3590
Joined: Thu Dec 29, 2011 5:13 am
Location: Tauranga, New Zealand

Re: Cleaning up game titles on Details pages

Post by richardtoohey » Thu Apr 05, 2018 9:06 am

Searchable tags work well for this sort of thing so that's a good plan.

So you can also include other names, misspellings etc. and ditch words like "an" and "the".

More than one way to skin the tag cat (I've used a keywords field with full-text index in some projects, had a separate tag table for others, and bound to be other/better ways!)

So for "The Island" - the tags could include "island". If an adventure game, "adventure".

Rick Hanson Trilogy - tags "rick", "hanson", "trilogy".

But I'm not doing the work so :-# from me!

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Thu Apr 05, 2018 1:27 pm

I think that is mostly already covered by genres. Adventure games all have a genre of adventure. I don't think there is an advantage in using a tag of "Island" for "The Island", because we are already proposing to move "The" into a different field rather than a tag, and searching for "Island" will already show that game. Tags are more a development of the existing release types and the additional text in square brackets that Lee and Lurkio have been adding (mostly pre-release or hack).

So the game table will include the following fields:

Code:

article, title, aka1, aka2, aka3, aka4, then the rest.
Or maybe AKAs should be in another table...

Code:

gameid, aka
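
To make the second option concrete, here is a rough sketch of a separate AKA table searched alongside the title, again using in-memory SQLite from Python. The table name `game_akas`, the `search` helper and the sample AKA are all made up for the example; only the `gameid, aka` column pair comes from the post above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (
    id            INTEGER PRIMARY KEY,
    title_article TEXT,
    title         TEXT NOT NULL
);

-- Any number of alternative names per game, instead of aka1..aka4 columns.
CREATE TABLE game_akas (
    gameid INTEGER NOT NULL REFERENCES games(id),
    aka    TEXT NOT NULL
);

INSERT INTO games VALUES (1, 'The', 'Island'), (2, NULL, 'Citadel');
-- Invented alternative name, for illustration only.
INSERT INTO game_akas VALUES (1, 'Example Alternate Name');
""")


def search(db, term):
    """Return ids of games whose title or any AKA contains the term."""
    like = f"%{term}%"
    rows = db.execute("""
        SELECT DISTINCT g.id
        FROM games g
        LEFT JOIN game_akas a ON a.gameid = g.id
        WHERE g.title LIKE ? OR a.aka LIKE ?
        ORDER BY g.id
    """, (like, like))
    return [row[0] for row in rows]


print(search(conn, "island"))
print(search(conn, "alternate"))
```

A separate table avoids having to guess the maximum number of AKAs up front, at the cost of one extra join in the search query.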

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Sun Apr 08, 2018 10:39 am

I have a little time today, so I will make a start on the article.
I am going to make the end result what you wanted, i.e. the article comes at the end on the index page and at the beginning on the details page. Because I am splitting out the field, you could also have it at the beginning on the index page and not affect the sort order.
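
The display rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the site's PHP: the `Game` class and the three helper functions are hypothetical names, and only the `title_article` and `title` fields correspond to columns mentioned in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Game:
    title: str
    title_article: Optional[str] = None  # "The", "A", "An", or None


def index_title(game: Game) -> str:
    """Index page: article goes at the end, e.g. "Island, The"."""
    if game.title_article:
        return f"{game.title}, {game.title_article}"
    return game.title


def details_title(game: Game) -> str:
    """Details page: article goes at the front, e.g. "The Island"."""
    if game.title_article:
        return f"{game.title_article} {game.title}"
    return game.title


def sorted_for_index(games: List[Game]) -> List[Game]:
    """Sort on the bare title, so the article never affects the order."""
    return sorted(games, key=lambda g: g.title.lower())


games = [Game("Island", "The"), Game("Citadel")]
for g in sorted_for_index(games):
    print(index_title(g), "|", details_title(g))
```

Because the sort key is the bare title, switching the index page to show the article first would be a one-line change in `index_title` with no effect on ordering.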

I need to change the details page, the index page, and of course the admin page. The following SQL sorts out all the rows apart from "Hunt, The: Search For Shauna" (because of the colon), which can be done manually (SQL doesn't have regular expression replace).

Code:

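-- Move "The" into title_article.
-- Note: REPLACE removes every ', The' in the title, not only the last one.
UPDATE `games`
 SET title_article = 'The', title = REPLACE(title, ', The', '')
 WHERE title LIKE '%, The' OR title LIKE '%, The %';

-- Same for "A"; ', A' also matches the start of ', And', so check the results.
UPDATE `games`
 SET title_article = 'A', title = REPLACE(title, ', A', '')
 WHERE title LIKE '%, A' OR title LIKE '%, A %';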
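
For comparison, a one-off migration could also be scripted outside SQL so that only the final article is moved. The sketch below is an assumption-laden illustration, not what was actually run: it uses SQLite from Python instead of the site's MySQL, and `TRAILING_ARTICLE`, `split_article` and `migrate` are invented names.

```python
import re
import sqlite3

# Hypothetical pattern: the article must be the last ", The/An/A" and be
# followed either by the end of the title or by " (" / " [".
TRAILING_ARTICLE = re.compile(r"^(.*), (The|An|A)((?: [\[(].*)?)$")


def split_article(title):
    """Return (article, remaining_title); article is None if nothing matched."""
    m = TRAILING_ARTICLE.match(title)
    if m is None:
        return None, title
    return m.group(2), m.group(1) + m.group(3)


def migrate(db):
    """Fill title_article for rows that still carry the article in the title."""
    rows = db.execute(
        "SELECT id, title FROM games WHERE title_article IS NULL"
    ).fetchall()
    changed = 0
    for game_id, title in rows:
        article, rest = split_article(title)
        if article is not None:
            db.execute(
                "UPDATE games SET title_article = ?, title = ? WHERE id = ?",
                (article, rest, game_id),
            )
            changed += 1
    return changed


print(split_article("Island, The"))
print(split_article("Hunt, The: Search For Shauna"))
```

Titles such as "Hunt, The: Search For Shauna" are left untouched by this pattern, so they would still need fixing by hand, as noted above.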

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Sun Apr 08, 2018 10:43 am

The other thing I need to change of course is the spreadsheet at ss.php. Can I add a column before the title without messing up your comparison too much Lee?

leenew
Posts: 3661
Joined: Wed Jul 04, 2012 3:27 pm
Location: Doncaster, Yorkshire

Re: Cleaning up game titles on Details pages

Post by leenew » Sun Apr 08, 2018 11:36 am

Hi Paul,
The ss.php is my absolute reference now, so you can add anything to it :D
If there is any work for me to do, just let me know.
I can usually get away with doing a little titivating of spreadsheets at work :wink:

Lee

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Sun Apr 08, 2018 1:10 pm

The next post is what changed (Sorry it is so long)

Code:

-- Build one BBCode list item per game: article outside the link, title inside it.
SELECT
concat('[*]',title_article,' ','[url=http://bbcmicro.co.uk/game.php?id=',id,']',title,'[/url]')
FROM `games` WHERE title_article is not null and title_article > ' ';
The article is displayed at the front without being part of the URL to show it is part of a different column. I like the way this also tidies up the type-ahead suggestions, which I think would also benefit from having the various bits in brackets removed, which my other suggestions would also do. I will leave this as it is for the moment, as there doesn't seem to be much enthusiasm, and I don't have loads of time anyway!

Let me know if I have messed anything up. I obviously kept a copy of the database before I changed stuff, so I can put it back if necessary.

pau1ie
Posts: 552
Joined: Thu May 10, 2012 9:48 pm
Location: Bedford

Re: Cleaning up game titles on Details pages

Post by pau1ie » Sun Apr 08, 2018 1:10 pm



leenew
Posts: 3661
Joined: Wed Jul 04, 2012 3:27 pm
Location: Doncaster, Yorkshire

Re: Cleaning up game titles on Details pages

Post by leenew » Sun Apr 08, 2018 3:38 pm

Thanks Paul, that is really good [-o<

Lee.
