Duplicate posts in .Text blogs

For anyone who has noticed a lot of duplicate posts recently in .Text weblogs: this is not a SharpReader bug, but rather the result of some changes in the RSS generated by a recent version of .Text.

These changes are:

  • <guid>, which went from http://{site}/{blogname}/posts/{post-number}.aspx to http://{site}/{blogname}/archive/{year}/{month}/{day}/{post-number}.aspx
  • <link>, which was changed in the same way.
  • <description>, which now has an invisible <img> appended to the end that reports back to the server when someone reads an item.

The <guid> change alone would be enough to make most aggregators choke and treat this as a different entry. SharpReader actually does a secondary check when no matching guid is found: if at least 2 of the 3 fields <link>, <title> and <description> are equal, it still concludes this is the same item. Because this release changed the guid, link and description, though, only the <title> remained the same, and SharpReader therefore rules this to be a new item.
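
To make the matching rule concrete, here is a rough sketch of it in Python. This is not SharpReader's actual code; the FeedItem type, the is_same_item function, the example.com URLs and the tracker image URL are all made up for illustration, and the check is assumed to be plain string equality on each field.

```python
from dataclasses import dataclass


@dataclass
class FeedItem:
    # Hypothetical container for the four RSS fields discussed above.
    guid: str
    link: str
    title: str
    description: str


def is_same_item(old: FeedItem, new: FeedItem) -> bool:
    """Illustrative version of the guid check plus the 2-out-of-3 fallback."""
    # Primary check: a matching guid means it is the same item.
    if old.guid == new.guid:
        return True
    # Secondary check: count how many of link, title, description are equal.
    matches = sum([
        old.link == new.link,
        old.title == new.title,
        old.description == new.description,
    ])
    return matches >= 2


# Example item before and after the feed change (made-up values).
old = FeedItem(
    guid="http://example.com/blog/posts/123.aspx",
    link="http://example.com/blog/posts/123.aspx",
    title="Hello",
    description="<p>Body</p>",
)
new = FeedItem(
    guid="http://example.com/blog/archive/2004/01/15/123.aspx",
    link="http://example.com/blog/archive/2004/01/15/123.aspx",
    title="Hello",
    description='<p>Body</p><img src="http://example.com/tracker/123" width="1" height="1">',
)

# Only the title still matches, so the item is treated as new.
print(is_same_item(old, new))  # False
```

Had only the guid changed, link, title and description would all still match and the fallback would have recognized the item as the same one.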

If you've run into this problem, the best thing to do is to manually remove the old items (they should be further down in the headlines view in the natural ordering) and keep the new ones. While it may be easier to remove the new items instead, I would advise against this, as you would lose any item updates or comments: those will be made against the new item only.