By Jay on Wednesday, 11 April 2018
Posted in General
Hey,
When I post a link to an event from our network, it shows only a very generic preview. Why is that? See the attached images.
The correct event title, description, and image were not retrieved because the crawler cannot access this event publicly.

If you open the event without logging in, you are redirected to the login page.

To retrieve the correct event title, description, and image, the event has to be publicly accessible.

Please refer to my attached screenshot below, which shows the stream layout for an event.
Wednesday, 11 April 2018 23:08
I understand that. But I'm talking about a post within the network, not on Facebook. You would expect the crawler to have the same permissions as anybody else in the network!
Jay
Wednesday, 11 April 2018 23:32
I understand. It might be possible to achieve, but it could also affect other things behind the scenes, such as performance.

For example, take the event link https://faenet.org/community-events/29-music-gatherette
1. The user shares an internal event link through the story form.
2. I assume the system would need to check the following to know whether the link really is an internal event link:
- whether the link's domain matches the current domain name
- whether the last segment of the URL matches an existing event ID (29)
- whether the last segment of the URL matches an existing event alias (music-gatherette)
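
To make those checks concrete, here is a rough sketch in Python (not the platform's actual code). The function name, the `SITE_DOMAIN` constant, and the assumed `<id>-<alias>` shape of the last URL segment are all assumptions for illustration. It only parses the link; confirming that the ID and alias really exist would still need a database lookup.

```python
import re
from urllib.parse import urlparse

# Assumption: the current site's domain, taken from the example link above.
SITE_DOMAIN = "faenet.org"


def parse_internal_event_link(url, site_domain=SITE_DOMAIN):
    """Return (event_id, alias) if the URL looks like an internal event link, else None.

    Hypothetical helper: assumes the last path segment has the form "<id>-<alias>".
    """
    parts = urlparse(url)

    # Check 1: the link's domain must match the current domain name.
    if parts.hostname is None or parts.hostname.lower() != site_domain:
        return None

    # Checks 2 and 3: split the last segment into a numeric ID and an alias.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        return None
    match = re.fullmatch(r"(\d+)-([a-z0-9-]+)", segments[-1])
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_internal_event_link(
    "https://faenet.org/community-events/29-music-gatherette"
))
```

Note that a link without the numeric ID in its last segment (for example `.../community-events/music-gatherette`) returns `None` here, so the sketch cannot recognise it.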

Possible issues:
- It might hit a timeout, because there is no way to tell from the link whether it points to a group, page, event, stream item, and so on. The system would have to scan the existing data in the group, event, page, and stream tables to see whether anything matches that ID and alias.

In this situation the system would spend a long time scanning. Imagine a site whose group, event, page, and stream tables have already reached 1 million records: the scan would end in a timeout error when the user shares a link in the story form.

- If the site uses a third-party SEF extension, the event's SEF URL can be changed through that extension.
For example, the site owner may not want the event ID shown next to the event alias in the URL.
In this situation the system cannot determine whether the link points to an event created in the internal network.

So the faster way to retrieve the page's HTML content without running into the issues described above is to rely on cURL: it crawls the event page and reads its meta information, such as the Open Graph tags and the meta title/description.
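
As an illustration of that last step, here is a minimal sketch in Python of reading the Open Graph and meta tags from a page's HTML. It is not the platform's implementation: the class and function names are made up for this example, and the HTML is passed in as a string instead of being fetched with cURL. It also shows why a login redirect produces a generic preview: the crawler only sees the login page's tags.

```python
from html.parser import HTMLParser


class MetaPreviewParser(HTMLParser):
    """Collects <meta> tags and the <title> text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Open Graph tags use "property"; classic meta tags use "name".
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            if key and content is not None:
                self.meta.setdefault(key, content)  # keep the first occurrence
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_preview(html):
    """Hypothetical helper: build a link preview from already-fetched HTML."""
    parser = MetaPreviewParser()
    parser.feed(html)
    parser.close()
    meta = parser.meta
    return {
        "title": meta.get("og:title") or parser.title.strip() or None,
        "description": meta.get("og:description") or meta.get("description"),
        "image": meta.get("og:image"),
    }


# What a crawler sees when the event page redirects to a login page (made-up HTML).
login_page = "<html><head><title>Login</title></head><body></body></html>"
print(extract_preview(login_page))
```

For a publicly accessible event page that carries `og:title`, `og:description`, and `og:image` tags, the same function would return the event's own values.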
Thursday, 12 April 2018 11:47