By Sam Famma on Thursday, 15 May 2014
Posted in General Issues
Replies 15
Likes 0
Views 766
Votes 0
Hi guys

I have been doing some testing and I have come across the social_stream database table. This table is going to be huge: with only a handful of users and a very limited amount of information put in, it has already grown to 35,723.23 KB. What will happen with 2,000 users, or 50,000?
This is why it is extremely important to choose your stream generations wisely. For example, I don't log logins, because that one item alone would generate a lot of stream items per user. This is one of the big reasons why it's important for ES to let you manage which stream items get generated. Fortunately, ES is pretty generous in allowing you to disable a lot of stream items. There are still more options needed, which I assume will be added some day.

Also, eventually we will need auto-purging. For example, activity over a year old could be automatically removed if the admin enabled it (this currently doesn't exist). This way we save tons of space and still give users the activity they want to see. And for the folks who love going far back in time, that's what the profile apps are for.
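A minimal sketch of what such an auto-purge could look like, with Python's sqlite3 standing in for the real database; the `social_stream` table and `created` column here are assumptions for illustration, not EasySocial's actual schema:

```python
import sqlite3
from datetime import datetime, timedelta

def purge_old_stream_items(conn, max_age_days=365):
    """Delete stream items older than the admin-chosen cutoff."""
    cutoff = datetime.utcnow() - timedelta(days=max_age_days)
    cur = conn.execute(
        "DELETE FROM social_stream WHERE created < ?",
        (cutoff.isoformat(),),  # ISO strings of the same shape compare correctly
    )
    conn.commit()
    return cur.rowcount  # number of purged rows

# Demo with an in-memory database and two rows: one old, one recent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE social_stream (id INTEGER PRIMARY KEY, created TEXT)")
old = (datetime.utcnow() - timedelta(days=400)).isoformat()
new = datetime.utcnow().isoformat()
conn.executemany("INSERT INTO social_stream (created) VALUES (?)", [(old,), (new,)])
purged = purge_old_stream_items(conn)
print(purged)  # the 400-day-old row is removed
```

In a real deployment this would run from a cron task or scheduled plugin, with the cutoff taken from an admin setting.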
· Thursday, 15 May 2014 15:11 · 0 Likes · 0 Votes · 0 Comments
Okay, I understand the importance of selecting the correct items to be generated on the stream, and I know the importance of purging data for this type of application.

But I wonder if there is not a better way of handling the stream in the database. Having so many entries in one location will eventually slow the system down, since every request has to search through such a huge table. I was working on a project of a similar nature where we had to manage 3,600,050 locations. We started off with the same approach and it did not work until we broke the table up into logical parent and child branches; that automatically made the system faster, due to how the information was retrieved for each request, and it also deflated the size of each table.
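The parent/child split described above could be sketched roughly like this; the `branches` parent table and `locations_*` child tables are hypothetical names, and sqlite3 stands in for whatever database the original project used:

```python
import sqlite3

# Sketch: instead of one huge "locations" table, keep a small parent table
# of branches and route each record into a per-branch child table, so any
# one query only scans the rows belonging to its branch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branches (name TEXT PRIMARY KEY)")

def child_table(branch):
    # Branch names here come from code, not user input; real code would
    # validate them before interpolating into SQL.
    return f"locations_{branch}"

def add_branch(conn, branch):
    conn.execute("INSERT OR IGNORE INTO branches (name) VALUES (?)", (branch,))
    conn.execute(f"CREATE TABLE IF NOT EXISTS {child_table(branch)} "
                 "(id INTEGER PRIMARY KEY, payload TEXT)")

def insert_location(conn, branch, payload):
    add_branch(conn, branch)
    conn.execute(f"INSERT INTO {child_table(branch)} (payload) VALUES (?)",
                 (payload,))

for i in range(100):
    insert_location(conn, "north" if i % 2 else "south", f"site-{i}")

# A branch-scoped query never touches the other branch's rows.
north = conn.execute(f"SELECT COUNT(*) FROM {child_table('north')}").fetchone()[0]
print(north)  # 50 of the 100 rows landed in the north child table
```

The trade-off is that cross-branch queries now need a UNION over child tables, which is why this only pays off when most requests are scoped to one branch.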

My main concern is that importing, exporting or backing up the database is going to become a big deal for most administrators. Even treating the table on its own will become difficult with a few thousand users. Most scripts will time out on import, and some even on export.

On my current database all the other ES tables combined are at 773 KB, while the stream is at 40,186 KB.

I will turn off other activity stream items and keep it to a bare minimum.
· Thursday, 15 May 2014 16:07 · 0 Likes · 0 Votes · 0 Comments
There's no way around this, Sam. The idea of building a stream is basically like logging.
· Thursday, 15 May 2014 16:24 · 0 Likes · 0 Votes · 0 Comments
Well, I hope purging data to other tables climbs close to the top of the list, Mark.
· Thursday, 15 May 2014 17:21 · 0 Likes · 0 Votes · 0 Comments
Hello Sam,

I am not really sure what workaround we could add for this, as this is pretty much the only way to store such data.
· Friday, 16 May 2014 01:27 · 0 Likes · 0 Votes · 0 Comments
I'm not too technically inclined on how this works, but I suppose it might be possible to have it create a new stream table per year to reduce the huge load. But that can get complicated. As for the auto purging, the idea was purely meant for stream items and not for the activities themselves (we want old content to remain). Auto purging for private messages is also a must (eventually). I'd probably let the messages remain for 1.5-2 years.
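The "one stream table per year" idea might look something like the sketch below; the `stream_<year>` naming and columns are assumptions for illustration, with sqlite3 standing in for the real database. Inserts are routed by year, so no single table grows without bound:

```python
import sqlite3
from datetime import datetime

def year_table(year):
    # Year comes from a parsed date, so the table name is safe to interpolate.
    return f"stream_{year}"

def ensure_year(conn, year):
    conn.execute(f"CREATE TABLE IF NOT EXISTS {year_table(year)} "
                 "(id INTEGER PRIMARY KEY, actor TEXT, created TEXT)")

def add_stream_item(conn, actor, created):
    # Route the row to the table for its year.
    year = datetime.fromisoformat(created).year
    ensure_year(conn, year)
    conn.execute(f"INSERT INTO {year_table(year)} (actor, created) VALUES (?, ?)",
                 (actor, created))

conn = sqlite3.connect(":memory:")
add_stream_item(conn, "sam", "2014-05-15T15:11:00")
add_stream_item(conn, "mark", "2013-04-01T09:00:00")
count_2014 = conn.execute(f"SELECT COUNT(*) FROM {year_table(2014)}").fetchone()[0]
print(count_2014)  # only 2014's item lands in stream_2014
```

As noted above, this does complicate reads: a stream page spanning a year boundary has to query two tables, which is exactly the kind of complexity that makes this approach a judgment call.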
· Friday, 16 May 2014 02:21 · 0 Likes · 0 Votes · 0 Comments
Thanks for the heads up on this Josh
· Friday, 16 May 2014 02:37 · 0 Likes · 0 Votes · 0 Comments
Hi Mark

I have been thinking about it - it can be done, partly in the code and partly in the database. The issue is that the database side will always need to be set up by your customer.
I'm thinking of creating a new database where the location of each user's archive files can be stored and recalled on the fly from a storage archive. However, it would need a daemon set up to transport and retrieve each individual user's archive folder and load it back into an archive-filtered activity stream.
This would allow for almost unlimited storage and easy management of data, similar to Microsoft Outlook's archive folders.

I will try to design something and make it as simple as possible by taking the Microsoft Outlook approach and implementing it as an app. Something like automatically moving a user's activity once it reaches a certain age or size, but allowing the user to retrieve it whenever they wish, for a certain amount of time, before it is moved back to the archive.
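A rough sketch of that Outlook-style archive-and-restore flow, under assumed `stream` and `stream_archive` tables (sqlite3 for illustration, not EasySocial's schema):

```python
import sqlite3

def archive_user(conn, user, cutoff):
    # Move a user's rows older than the cutoff out of the live table.
    conn.execute("INSERT INTO stream_archive SELECT * FROM stream "
                 "WHERE user = ? AND created < ?", (user, cutoff))
    conn.execute("DELETE FROM stream WHERE user = ? AND created < ?",
                 (user, cutoff))
    conn.commit()

def restore_user(conn, user):
    # Bring the user's archived rows back on demand.
    conn.execute("INSERT INTO stream SELECT * FROM stream_archive WHERE user = ?",
                 (user,))
    conn.execute("DELETE FROM stream_archive WHERE user = ?", (user,))
    conn.commit()

conn = sqlite3.connect(":memory:")
for tbl in ("stream", "stream_archive"):
    conn.execute(f"CREATE TABLE {tbl} (id INTEGER, user TEXT, created TEXT)")
conn.executemany("INSERT INTO stream VALUES (?, ?, ?)",
                 [(1, "sam", "2012-01-01"), (2, "sam", "2014-05-01")])
archive_user(conn, "sam", "2014-01-01")
live = conn.execute("SELECT COUNT(*) FROM stream").fetchone()[0]
print(live)  # only the recent row stays live
```

The daemon described above would essentially call `archive_user` on a schedule and `restore_user` when someone browses their old history, moving the archive table to cheaper storage in between.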

This should keep your existing table at a manageable size, and each database request would not need to search through tons of data, making the system much more efficient and responsive.

At the moment it's just a thought, but I have achieved this in the past in other projects.
· Friday, 16 May 2014 02:40 · 0 Likes · 0 Votes · 0 Comments
Hello Sam,

I am not too sure what you are trying to achieve here, but reducing the number of records already defeats the purpose of having a database with indexed items. The whole point of storing data in a DB is that we can quickly retrieve records based on the index.

If you want to create another table just to store archived data wouldn't that also get filled up over time eventually?
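The point about indexed retrieval is easy to see with sqlite3's query planner (the `stream` table and `user` column here are hypothetical): with an index on the lookup column, the planner searches the index instead of scanning every row, so the table's total size matters far less per query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stream (id INTEGER PRIMARY KEY, user TEXT, created TEXT)")
conn.execute("CREATE INDEX idx_stream_user ON stream (user)")

# Load 10,000 rows spread across 100 users.
conn.executemany(
    "INSERT INTO stream (user, created) VALUES (?, ?)",
    [(f"user{i % 100}", f"2014-05-{(i % 28) + 1:02d}") for i in range(10_000)],
)

# Ask the planner how it would answer a per-user lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM stream WHERE user = ?", ("user7",)
).fetchone()
print(plan)  # the plan mentions idx_stream_user rather than a full table scan
```

This is also why the archive idea and indexing are not really in conflict: an index keeps per-request lookups fast, while archiving addresses the separate problem of backup, import and export size.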
· Friday, 16 May 2014 16:22 · 0 Likes · 0 Votes · 0 Comments
Right, I forgot about the indexing mechanism for speeding things up. That is comforting news.
· Friday, 16 May 2014 16:31 · 0 Likes · 0 Votes · 0 Comments
Yeah, I really don't see what's wrong with a lot of data in a single table, though.
· Friday, 16 May 2014 19:01 · 0 Likes · 0 Votes · 0 Comments
Hi Mark
The video below is similar to what we did a few years ago in tackling large data in the DB, though for it to work the raw data obviously needed to have been hashed accordingly. The only extra thing we did was transport data to other databases and keep the master DB to a minimum; it might give you some ideas: https://www.youtube.com/watch?v=uMxZ4RI6sCQ
There is not much wrong with working with lots of data in one database for a small application, but the concept you have introduced with EasySocial, if tackled correctly, could in my opinion compete with any major platform, provided scalability is taken into account at such an early stage.
· Sunday, 18 May 2014 23:48 · 0 Likes · 0 Votes · 0 Comments
Thanks for the heads up on this Sam! Will watch the video!
· Monday, 19 May 2014 01:43 · 0 Likes · 0 Votes · 0 Comments
I'm curious, what video?
· Monday, 19 May 2014 03:57 · 0 Likes · 0 Votes · 0 Comments
Hello Josh,

Sam included a video in his post earlier. This is the video https://www.youtube.com/watch?v=uMxZ4RI6sCQ&html5=1
· Monday, 19 May 2014 19:47 · 0 Likes · 0 Votes · 0 Comments