phpBB2Refugees.com Logo
Not affiliated with or endorsed by the phpBB Group

Register •  Login 

Continue the legacy...

Welcome to all phpBB2 Refugees!Wave Smilie

This site is intended to continue support for the legacy 2.x line of the phpBB2 bulletin board package. If you are a fan of phpBB2, please, by all means register, post, and help us out by offering your suggestions. We are primarily a community and support network. Our secondary goal is to provide a phpBB2 MOD Author and Styles area.

Fast Query for Viewtopic to calculate viewforum page?


 
Search this topic... | Search phpBB2 Discussion... | Search Box
Register or Login to Post    Index » phpBB2 Discussion  Previous TopicPrint TopicNext Topic
Author Message
JLA
Board Member



Joined: 30 Apr 2009

Posts: 298
Location: U.S.A


flag
PostPosted: Sat Dec 06, 2014 5:19 pm 
Post subject: Fast Query for Viewtopic to calculate viewforum page?

Has anyone written a query that can easily (when viewing viewtopic) calculate which page on viewforum that particular topic appears? Looking for this for the breadcrumbs. Keep in mind some of the forums have 10's of thousands of pages.
_________________
http://www.jlaforums.com
Back to top
Vendethiel
Board Member



Joined: 26 Oct 2014

Posts: 55



PostPosted: Sat Dec 06, 2014 7:16 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

Code:

SELECT count(topic_id)
FROM phpbb_topics
WHERE topic_id > {your topic ID}
  AND forum_id = {your forum ID}
ORDER BY topic_last_post_id


And you divide this by the number of topics per page. Not sure that's really fast, though :/

_________________
Developer on EzArena, the ADR premod.
Developer on Icy Phoenix, the phpBB hybrid cms.
Developer on IntegraMOD.
Back to top
JLA
Board Member



Joined: 30 Apr 2009

Posts: 298
Location: U.S.A


flag
PostPosted: Sat Dec 06, 2014 8:45 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

Vendethiel wrote:
Code:

SELECT count(topic_id)
FROM phpbb_topics
WHERE topic_id > {your topic ID}
  AND forum_id = {your forum ID}
ORDER BY topic_last_post_id


And you divide this by the number of topics per page. Not sure that's really fast, though :/


That would be bad - especially on forum sections that contain millions of topics icon_sad.gif

_________________
http://www.jlaforums.com
Back to top
drathbun
Board Member



Joined: 24 Jul 2008

Posts: 653
Location: Texas


flag
PostPosted: Wed Dec 10, 2014 2:28 am 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

What indexes do you have on phpbb_topics right now?
_________________
phpBBDoctor Blog
Back to top
JLA
Board Member



Joined: 30 Apr 2009

Posts: 298
Location: U.S.A


flag
PostPosted: Thu Dec 11, 2014 3:25 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

A few different ones but curious to know what indexes could help a problem like this. The problem I'm seeing with trying to calculate which forum page the topic sits on is

* We are talking about hundreds of millions of topics in over 1000 forum sections

* Say for example if a topic is posted in forum X within the last 30 seconds - chances are the topic sits on page 1 of XXXXXXX of forum X. But a few minutes later it might sit on page 2 since new topics will have been added to that forum and pushed our example topic further down the list. But if the topic receives a reply then it then moves back to page 1 - so trying to calculate the current forum page the topic sits on turns into a much more complicated query-queries. Of course in "some" cases you could use a referrer if a user comes to that topic from a specific forum page - but can change quickly and refs do not always work and alot of users come to topics without even visiting the forum page that topic might be on at the moment (links, organic search, etc)

_________________
http://www.jlaforums.com
Back to top
drathbun
Board Member



Joined: 24 Jul 2008

Posts: 653
Location: Texas


flag
PostPosted: Fri Dec 12, 2014 11:37 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

Indexes are used to optimize queries. With the proper index, you should be able to answer your question nearly instantly. The combination of last_post_id and forum_id should give you everything you know for the topic, yes?

If you have a million topics in a single forum, that's a million topic_last_post_id values for that forum. Conveniently topic_last_post_id is also unique, and it's also what determines the order of topics on a forum. It has been a while since I looked at a stock phpBB schema so I pulled this out of the installation file:
Code:
CREATE TABLE phpbb_topics (
   topic_id mediumint(8) UNSIGNED NOT NULL auto_increment,
   forum_id smallint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_title char(60) NOT NULL,
   topic_poster mediumint(8) DEFAULT '0' NOT NULL,
   topic_time int(11) DEFAULT '0' NOT NULL,
   topic_views mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_replies mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_status tinyint(3) DEFAULT '0' NOT NULL,
   topic_vote tinyint(1) DEFAULT '0' NOT NULL,
   topic_type tinyint(3) DEFAULT '0' NOT NULL,
   topic_first_post_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_last_post_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_moved_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   PRIMARY KEY (topic_id),
   KEY forum_id (forum_id),
   KEY topic_moved_id (topic_moved_id),
   KEY topic_status (topic_status),
   KEY topic_type (topic_type)
);

There is a primary key on topic_id, which is to be expected. There is a key on forum_id which helps the system quickly determine a list of tpoics that go in a forum. And so on. What's missing is any sort of index on topic_last_post_id.

On my system I have the following indexes:
Primary Key: topic_id
Forum ID: forum_id
Topic Moved: topic_moved_id
Topic Status: topic_status
Topic Type: topic_type
Topic Last Post ID: topic_last_post_id (nonstandard index)
Viewforum Index: forum_id, topic_type, topic_label, topic_last_post_id (nonstandard index)

The last two indexes are not standard; I have added them over the years for optimization. The very last index is designed specifically for my board where folks can filter by topic label. For example on this board in the MODs area you can filter by topic status (DEV, BETA, and so on). It includes the two columns we are interested in (forum_id and topic_last_post_id) but has additional items as well. Does that matter? It turns out that it does. More on that in a moment.

The easy way to determine which page a topic appears on would be done by counting the number of topics within the same forum that have a topic_last_post_id > the topic_last_post_id of the topic in question. To do that, I need two pieces of information: the forum_id and the topic_last_post_id. The last index I have above has both of those pieces of information, but there are two additional elements that are "in the way" of using this index. What that means is this: an index starts with the first column and uses the rest in order. So assume there's an index with a combination of forum_id + topic_last_post_id, that index can be used with a where clause on forum_id or forum_id + topic_last_post_id but it will not be used for topic_last_post_id by itself because the first column is missing.

So my index of forum_id, topic_type, topic_label, and topic_last_post_id would not help for this specific question because you're not providing the topic type or label.

To find out what index a query might use, you use the "explain" command. In a test board of mine I have a forum (39) with just over 29K topics. It's a long way from a million icon_smile.gif but the concepts are the same. The topic with topic_last_post_id of 906661 appears on page 3 of the forum. Here's my query and the results:
Code:
mysql> select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+---------------------------+
| count(topic_last_post_id) |
+---------------------------+
|                        50 |
+---------------------------+
1 row in set (0.00 sec)

But what did the database do to arrive at that number? How did it run the query?
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
| id | select_type | table         | type  | possible_keys                               | key                | key_len | ref  | rows | Extra       |
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
|  1 | SIMPLE      | busobj_topics | range | forum_id,topic_last_post_id,viewforum_index | topic_last_post_id | 3       | NULL |  214 | Using where |
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

The query plan says it could have used the indexes on forum_id, topic_last_post_id, or the viewforum_index that is my custom index I mentioned above. The key that was actually used was topic_last_post_id, which is not a standard phpBB index either. So at the very least, you would want to create this index. But if I put my DBA hat on, what I would really want is a combined index of forum_id + topic_last_post_id. The question is, what order do they go in?

Suppose I create an index on forum_id + topic_last_post_id and run the same query.
Code:
mysql> create unique index test_u1 on busobj_topics (forum_id, topic_last_post_id);
Query OK, 51900 rows affected (0.70 sec)
Records: 51900  Duplicates: 0  Warnings: 0

It's a unique index; that's important because it should help the optimizer. Here's the query plan.
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
| id | select_type | table         | type  | possible_keys                                       | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | busobj_topics | range | test_u1,forum_id,topic_last_post_id,viewforum_index | test_u1 | 5       | NULL |   48 | Using where; Using index |
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

This is excellent... see the "using index" part at the end? That says the query is able to answer the question using only the data in the index; it doesn't even need to go to the table to retrieve any data. If you have split your indexes on a different disk system than your tables, that means you can do simultaneous data queries and index queries, getting twice as much done at once. What if the index was created in the opposite order? There are far more unique topic_last_post_id values than forum_id values. It would seem that maybe the query could be more efficient if I first got the topics with a newer last post and then filtered by forum.
Code:
mysql> create index test_u2 on busobj_topics (topic_last_post_id, forum_id);
Query OK, 51900 rows affected (0.56 sec)
Records: 51900  Duplicates: 0  Warnings: 0

As before this is a unique index. What does the explain plan look like... now it has two possible indexes to choose from?
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
| id | select_type | table         | type  | possible_keys                                               | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | busobj_topics | range | test_u1,forum_id,topic_last_post_id,viewforum_index,test_u2 | test_u1 | 5       | NULL |   48 | Using where; Using index |
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

It's still using the first index I created. Why? Because it's able to apply a where clause to the first elements of the index and eliminate a (potentially huge) number of non-interesting topics right away. Then it does a count on the rows that pass the where clause to answer the second part of the question.

Think of an index as a tree, because that's essentially what it is. Imagine you have 20 different forums. The first branches of the index are by forum. By specifying a forum first, I eliminate 19 branches of the tree, or 95% of the data immediately. What's left is to traverse the sub-branches of the index to count topic_last_post_id values that are greater than my target number. This query is going to be the fastest for the newer topics. For the very last topic on your board, it's going to be slower. But if you think about it, that's okay. Your newer topics are far more likely to be viewed than your older topics, so you want that process to be as fast as possible.

Still with me so far? icon_smile.gif This is why I asked what indexes you already have on your topics table. Given the size of your board, I would be surprised if you had not added a few performance-based indexes already. If you get too many indexes, then you slow down updates and inserts. If you're only running queries, indexes generally remain stable.

What I would try is to first create a unique index on the combination of forum_id and topic_last_post_id in that order and then try running a few queries to do the count. It should be exactly as I wrote it above; if you do a count(*) then you can potentially cause the optimizer to think it has to go to the table, and we don't want that. By counting a unique column from the index we avoid using the table altogether.

But we're still not done. I have 25 topics displayed per page, and the query I ran earlier counted 50 topics that were newer than my selected topics.
Code:
mysql> select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+---------------------------+
| count(topic_last_post_id) |
+---------------------------+
|                        50 |
+---------------------------+
1 row in set (0.00 sec)

The problem is, there aren't 50 topics that appear before this one in order, there are 52. Where did the other two disappear to? Ha, they're stickies. icon_smile.gif

So if you want to be as accurate as possible, I would want to do this:
Code:
select count(topics) from tables
where forum = my_forum
and topic_last_post_id > my topic_last_post_id
plus
select count(stickies) from my_forum
plus
select count(announcements) from my_forum

That should do it. The thing is, having to recalculate the number of stickies and announcements every time is unnecessary overhead, because I can expect that number won't change very often. So what I would do is count and store the number of stickies and number of announcements and store them as fields in phpbb_forums, and retrieve them at the top of every page where you get the forum_name and other forum attributes. That way you add those pre-calculated numbers to the count of topics and then finally divide by the number of topics per page to arrive at your "best guess" as to what page you're on. It could be off by a page, based on how active the forum is, but it's going to be really close, and it's also going to be really efficient. Or at least as efficient as I can come up with for you right now. icon_smile.gif

_________________
phpBBDoctor Blog
Back to top
JLA
Board Member



Joined: 30 Apr 2009

Posts: 298
Location: U.S.A


flag
PostPosted: Wed Dec 17, 2014 7:20 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

1st - let me just start by saying Thank You VERY MUCH for taking the time to post this reply. It is very much appreciated!!

You are correct - we do have different indexes for performance and one that is along the lines of the one needed in your example. You are correct, that running the query (the query time) for newer topics is not too bad but for those forums with many topics, the time for much older topics (on which we get considerable traffic) the query times become unacceptable. We tested on one forum (with over 10 million topics) on the last topic of that forum. Counting the topic_last_post_id in that forum where the topic_last_post_id was greater than the last post id of that topic too much too long. Using the explain it used the appropriate key but still had to go through over 6.3 million rows



drathbun wrote:
Indexes are used to optimize queries. With the proper index, you should be able to answer your question nearly instantly. The combination of last_post_id and forum_id should give you everything you know for the topic, yes?

If you have a million topics in a single forum, that's a million topic_last_post_id values for that forum. Conveniently topic_last_post_id is also unique, and it's also what determines the order of topics on a forum. It has been a while since I looked at a stock phpBB schema so I pulled this out of the installation file:
Code:
CREATE TABLE phpbb_topics (
   topic_id mediumint(8) UNSIGNED NOT NULL auto_increment,
   forum_id smallint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_title char(60) NOT NULL,
   topic_poster mediumint(8) DEFAULT '0' NOT NULL,
   topic_time int(11) DEFAULT '0' NOT NULL,
   topic_views mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_replies mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_status tinyint(3) DEFAULT '0' NOT NULL,
   topic_vote tinyint(1) DEFAULT '0' NOT NULL,
   topic_type tinyint(3) DEFAULT '0' NOT NULL,
   topic_first_post_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_last_post_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   topic_moved_id mediumint(8) UNSIGNED DEFAULT '0' NOT NULL,
   PRIMARY KEY (topic_id),
   KEY forum_id (forum_id),
   KEY topic_moved_id (topic_moved_id),
   KEY topic_status (topic_status),
   KEY topic_type (topic_type)
);

There is a primary key on topic_id, which is to be expected. There is a key on forum_id which helps the system quickly determine a list of tpoics that go in a forum. And so on. What's missing is any sort of index on topic_last_post_id.

On my system I have the following indexes:
Primary Key: topic_id
Forum ID: forum_id
Topic Moved: topic_moved_id
Topic Status: topic_status
Topic Type: topic_type
Topic Last Post ID: topic_last_post_id (nonstandard index)
Viewforum Index: forum_id, topic_type, topic_label, topic_last_post_id (nonstandard index)

The last two indexes are not standard; I have added them over the years for optimization. The very last index is designed specifically for my board where folks can filter by topic label. For example on this board in the MODs area you can filter by topic status (DEV, BETA, and so on). It includes the two columns we are interested in (forum_id and topic_last_post_id) but has additional items as well. Does that matter? It turns out that it does. More on that in a moment.

The easy way to determine which page a topic appears on would be done by counting the number of topics within the same forum that have a topic_last_post_id > the topic_last_post_id of the topic in question. To do that, I need two pieces of information: the forum_id and the topic_last_post_id. The last index I have above has both of those pieces of information, but there are two additional elements that are "in the way" of using this index. What that means is this: an index starts with the first column and uses the rest in order. So assume there's an index with a combination of forum_id + topic_last_post_id, that index can be used with a where clause on forum_id or forum_id + topic_last_post_id but it will not be used for topic_last_post_id by itself because the first column is missing.

So my index of forum_id, topic_type, topic_label, and topic_last_post_id would not help for this specific question because you're not providing the topic type or label.

To find out what index a query might use, you use the "explain" command. In a test board of mine I have a forum (39) with just over 29K topics. It's a long way from a million icon_smile.gif but the concepts are the same. The topic with topic_last_post_id of 906661 appears on page 3 of the forum. Here's my query and the results:
Code:
mysql> select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+---------------------------+
| count(topic_last_post_id) |
+---------------------------+
|                        50 |
+---------------------------+
1 row in set (0.00 sec)

But what did the database do to arrive at that number? How did it run the query?
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
| id | select_type | table         | type  | possible_keys                               | key                | key_len | ref  | rows | Extra       |
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
|  1 | SIMPLE      | busobj_topics | range | forum_id,topic_last_post_id,viewforum_index | topic_last_post_id | 3       | NULL |  214 | Using where |
+----+-------------+---------------+-------+---------------------------------------------+--------------------+---------+------+------+-------------+
1 row in set (0.00 sec)

The query plan says it could have used the indexes on forum_id, topic_last_post_id, or the viewforum_index that is my custom index I mentioned above. The key that was actually used was topic_last_post_id, which is not a standard phpBB index either. So at the very least, you would want to create this index. But if I put my DBA hat on, what I would really want is a combined index of forum_id + topic_last_post_id. The question is, what order do they go in?

Suppose I create an index on forum_id + topic_last_post_id and run the same query.
Code:
mysql> create unique index test_u1 on busobj_topics (forum_id, topic_last_post_id);
Query OK, 51900 rows affected (0.70 sec)
Records: 51900  Duplicates: 0  Warnings: 0

It's a unique index; that's important because it should help the optimizer. Here's the query plan.
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
| id | select_type | table         | type  | possible_keys                                       | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | busobj_topics | range | test_u1,forum_id,topic_last_post_id,viewforum_index | test_u1 | 5       | NULL |   48 | Using where; Using index |
+----+-------------+---------------+-------+-----------------------------------------------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

This is excellent... see the "using index" part at the end? That says the query is able to answer the question using only the data in the index; it doesn't even need to go to the table to retrieve any data. If you have split your indexes on a different disk system than your tables, that means you can do simultaneous data queries and index queries, getting twice as much done at once. What if the index was created in the opposite order? There are far more unique topic_last_post_id values than forum_id values. It would seem that maybe the query could be more efficient if I first got the topics with a newer last post and then filtered by forum.
Code:
mysql> create index test_u2 on busobj_topics (topic_last_post_id, forum_id);
Query OK, 51900 rows affected (0.56 sec)
Records: 51900  Duplicates: 0  Warnings: 0

As before this is a unique index. What does the explain plan look like... now it has two possible indexes to choose from?
Code:
mysql> explain select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
| id | select_type | table         | type  | possible_keys                                               | key     | key_len | ref  | rows | Extra                    |
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
|  1 | SIMPLE      | busobj_topics | range | test_u1,forum_id,topic_last_post_id,viewforum_index,test_u2 | test_u1 | 5       | NULL |   48 | Using where; Using index |
+----+-------------+---------------+-------+-------------------------------------------------------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

It's still using the first index I created. Why? Because it's able to apply a where clause to the first elements of the index and eliminate a (potentially huge) number of non-interesting topics right away. Then it does a count on the rows that pass the where clause to answer the second part of the question.

Think of an index as a tree, because that's essentially what it is. Imagine you have 20 different forums. The first branches of the index are by forum. By specifying a forum first, I eliminate 19 branches of the tree, or 95% of the data immediately. What's left is to traverse the sub-branches of the index to count topic_last_post_id values that are greater than my target number. This query is going to be the fastest for the newer topics. For the very last topic on your board, it's going to be slower. But if you think about it, that's okay. Your newer topics are far more likely to be viewed than your older topics, so you want that process to be as fast as possible.

Still with me so far? icon_smile.gif This is why I asked what indexes you already have on your topics table. Given the size of your board, I would be surprised if you had not added a few performance-based indexes already. If you get too many indexes, then you slow down updates and inserts. If you're only running queries, indexes generally remain stable.

What I would try is to first create a unique index on the combination of forum_id and topic_last_post_id in that order and then try running a few queries to do the count. It should be exactly as I wrote it above; if you do a count(*) then you can potentially cause the optimizer to think it has to go to the table, and we don't want that. By counting a unique column from the index we avoid using the table altogether.

But we're still not done. I have 25 topics displayed per page, and the query I ran earlier counted 50 topics that were newer than my selected topics.
Code:
mysql> select count(topic_last_post_id) from busobj_topics where forum_id = 39 and topic_last_post_id > 906661;
+---------------------------+
| count(topic_last_post_id) |
+---------------------------+
|                        50 |
+---------------------------+
1 row in set (0.00 sec)

The problem is, there aren't 50 topics that appear before this one in order, there are 52. Where did the other two disappear to? Ha, they're stickies. icon_smile.gif

So if you want to be as accurate as possible, I would want to do this:
Code:
select count(topics) from tables
where forum = my_forum
and topic_last_post_id > my topic_last_post_id
plus
select count(stickies) from my_forum
plus
select count(announcements) from my_forum

That should do it. The thing is, having to recalculate the number of stickies and announcements every time is unnecessary overhead, because I can expect that number won't change very often. So what I would do is count and store the number of stickies and number of announcements and store them as fields in phpbb_forums, and retrieve them at the top of every page where you get the forum_name and other forum attributes. That way you add those pre-calculated numbers to the count of topics and then finally divide by the number of topics per page to arrive at your "best guess" as to what page you're on. It could be off by a page, based on how active the forum is, but it's going to be really close, and it's also going to be really efficient. Or at least as efficient as I can come up with for you right now. icon_smile.gif

_________________
http://www.jlaforums.com
Back to top
drathbun
Board Member



Joined: 24 Jul 2008

Posts: 653
Location: Texas


flag
PostPosted: Wed Dec 17, 2014 11:33 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

But how often will someone really pull up the last topic on a forum? I get it... if they do pull it up, you don't want to crater the server while the system figures out the pagination. So let's talk about making some assumptions, and see if we can get a "close enough" guess for the pagination to make it work, understanding that there's no guarantee that the pagination for the 1,000,000th topic in a forum is going to be accurate. icon_smile.gif

First, I would look at the date for the last post on a topic. If it's less than "X" days old, consider it defined as a "new" topic and use the logic mentioned previously. The value of "X" would be determined by how active your forum is, or forums are because you probably don't have the same traffic patterns across every forum on your board. As a starting point, find out what date you need to use to capture the newest 5% of the topics for a particular forum. Those topics would have accurate pagination calculated using the process outlined above. If 5% doesn't work, extend it to 10%, or shorten it to 3%, until you find a threshold that works.

What happens if the topic is older than "X" days old when you check? I need a "best guess" process, and to make that guess I need to know approximately how many pages of topics there are in the forum, and then approximately how far I am from the end.

First, the approximate pages for a forum... On a standard phpbb_forums table you already have a field that contains the total number of topics in that forum. You also know how many topics are displayed per page; that's part of the board configuration. That gives me a way to determine how many pages would be required to display all of the topics for a forum, which is a start. I just need a way to figure out where a really old topic is without actually counting it every time someone pulls it up.

Keep in mind that as soon as a topic gets a new post, it gets promoted to the 5% and it will then have a dynamic calculation again.

I have some thoughts on where to go next, but I have to go cook dinner first. icon_smile.gif I'll see what I can come up with later.

_________________
phpBBDoctor Blog
Back to top
drathbun
Board Member



Joined: 24 Jul 2008

Posts: 653
Location: Texas


flag
PostPosted: Wed Dec 17, 2014 11:35 pm 
Post subject: Re: Fast Query for Viewtopic to calculate viewforum page?

... here's one quick idea that occurred to me just as I posted. For the oldest 5% (or 10% or 3%) you use the exact same process as the newest 5%, you just turn the query around. Instead of finding how many topics are newer, find out how many topics are older. Using the total topics for the forum and the number of topics per page, you should be able to get a fairly quick result for the oldest 5% just like you can for the newest 5%. That just leaves the middle.
_________________
phpBBDoctor Blog
Back to top
Display posts from previous:   
Register or Login to Post    Index » phpBB2 Discussion  Previous TopicPrint TopicNext Topic
Page 1 of 1 All times are GMT
 
Jump to:  

Index • About • FAQ • Rules • Privacy • Search •  Register •  Login 
Not affiliated with or endorsed by the phpBB Group
Powered by phpBB2 © phpBB Group
Generated in 0.0286 seconds using 15 queries. (SQL 0.0016 Parse 0.0006 Other 0.0264)
phpBB Customizations by the phpBBDoctor.com
Template Design by DeLFlo and MomentsOfLight.com Moments of Light Logo