rss reading


Anonymous

Guest
Hi, I have a theoretical problem with RSS.

Take, for example, the site http://www.wdnews.net/ . How does it work?

OK: find the RSS channel, insert the new items into a database and show them. My problem is with cron, or whatever they use. New items show up on this site only a few minutes after they are published on the original site. If they use cron, how? Do they check all RSS channels every ten minutes? That is a lot of requests...

And a second problem: how can I distinguish new items from old ones if the RSS has nothing like pubDate or a timestamp (RSS 0.9?)?
I could try to insert all items into the database with the URL as a unique key, and stop the loop when an insert gives an error (the item is already in the database). But this looks very wrong...

Thanks for reading this terrible English. If somebody helps me I will be very happy.
 
Hey mm-marek!
mm-marek said:
New items show up on this site only a few minutes after they are published on the original site

Website 1 links to website 2's feed file.
Since a feed always holds the most recent data, website 1 will have that recent data too.
The date and time (or expiry date/time) and/or the content's category can also be used to separate or group this data on website 1.
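
On your second question (feeds with no date at all), treating the item URL as a unique key is a fairly common approach, and it doesn't have to rely on catching errors. Here is a rough sketch in Python with SQLite, only to illustrate the idea. The table layout and the names `open_store` and `store_new_items` are my own assumptions, not something wdnews.net is known to use. Note that it skips duplicates instead of stopping at the first one, because a feed isn't guaranteed to list items newest-first.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small item store; the schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        " url TEXT PRIMARY KEY,"  # the item link acts as the unique key
        " title TEXT,"
        " first_seen TEXT DEFAULT CURRENT_TIMESTAMP)"  # our own timestamp
    )
    return conn


def store_new_items(conn, items):
    """Insert (url, title) pairs and return only the ones not seen before."""
    new_items = []
    for url, title in items:
        cur = conn.execute(
            "INSERT OR IGNORE INTO items (url, title) VALUES (?, ?)",
            (url, title),
        )
        # rowcount is 1 when the row was inserted, 0 when the URL already existed
        if cur.rowcount == 1:
            new_items.append((url, title))
    conn.commit()
    return new_items
```

The `first_seen` column gives you a timestamp of your own even when the feed has none, so you can still sort items by when you first saw them.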

There's a technique feed owners use to reduce their own bandwidth consumption and to tell third-party sites when the feed data was last modified.

Here is a simple explanation on how it works:

Almost all aggregators store the date/time that a feed was last updated, and they pass this to the HTTP server via the If-Modified-Since HTTP header the next time they request the feed. If the feed hasn't changed since that date/time, the server returns an HTTP status code 304 to let the aggregator know the feed hasn't changed. So, the feed isn't re-downloaded when it hasn't changed, resulting in very little unnecessary bandwidth usage.
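
In case it helps, here is a minimal sketch of that conditional request using only Python's standard library. The function names (`build_request`, `fetch_if_changed`) and the `opener` parameter are made up for the example; a real aggregator would probably also store the value per feed in its database. One detail worth knowing: `urllib` reports a 304 as an `HTTPError`, so the sketch catches it.

```python
import urllib.error
import urllib.request


def build_request(url, last_modified=None):
    """Build a GET request that lets the server skip an unchanged feed."""
    headers = {}
    if last_modified:
        # Send back the Last-Modified value saved from the previous fetch
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url, last_modified=None, opener=urllib.request.urlopen):
    """Return (body, last_modified); body is None when the server answers 304."""
    request = build_request(url, last_modified)
    try:
        with opener(request, timeout=10) as response:
            # Feed changed (or first fetch): keep the new Last-Modified value
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: nothing to download or parse this time
            return None, last_modified
        raise
```

With this, a cron job that runs every ten minutes costs the feed owner only a small 304 reply most of the time, and you parse the feed only when `body` is not `None`.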

Hope it helps ;)
 