|
| 1 | +Feedly |
| 2 | +------ |
| 3 | + |
| 4 | +|Build Status| |
| 5 | + |
| 6 | +**Note** |
| 7 | + |
| 8 | +The Feedly open source project is in no way related to feedly.com. To |
| 9 | +avoid confusion we are considering renaming the 1.0 release of the |
| 10 | +project. |
| 11 | + |
| 12 | +What can you build? |
| 13 | +------------------- |
| 14 | + |
| 15 | +Feedly allows you to build newsfeed and notification systems using |
| 16 | +Cassandra and/or Redis. Examples of what you can build are the Facebook |
| 17 | +newsfeed, your Twitter stream or your Pinterest following page. We've |
| 18 | +built Feedly for `Fashiolista <http://www.fashiolista.com/>`__ where it |
| 19 | +powers the `flat feed <http://www.fashiolista.com/feed/?feed_type=F>`__, |
| 20 | +`aggregated feed <http://www.fashiolista.com/feed/?feed_type=A>`__ and |
| 21 | +the `notification |
| 22 | +system <http://www.fashiolista.com/my_style/notification/>`__. (Feeds |
| 23 | +are also commonly called: Activity Streams, activity feeds, news |
| 24 | +streams.) |
| 25 | + |
| 26 | +.. _readme_developing: https://github.com/tschellenbach/Feedly/blob/master/README.md#developing-feedly |
| 27 | + |
| 28 | +To get acquainted with Feedly quickly, take a look at the Pinterest-like |
| 29 | +example application we've created. You can find it |
| 30 | +`here <https://github.com/tbarbugli/feedly_pin/>`__. |
| 31 | + |
| 32 | +**Authors** |
| 33 | + |
| 34 | +- Thierry Schellenbach |
| 35 | +- Tommaso Barbugli |
| 36 | +- Guyon Morée |
| 37 | + |
| 38 | +**Resources** |
| 39 | + |
| 40 | +- `Documentation <https://feedly.readthedocs.org/>`__ |
| 41 | +- `Bug Tracker <http://github.com/tschellenbach/Feedly/issues>`__ |
| 42 | +- `Code <http://github.com/tschellenbach/Feedly>`__ |
| 43 | +- `Mailing List <https://groups.google.com/group/feedly-python>`__ |
| 44 | +- `IRC <irc://irc.freenode.net/feedly-python>`__ (irc.freenode.net, |
| 45 | + #feedly-python) |
| 46 | +- `Travis CI <http://travis-ci.org/tschellenbach/Feedly/>`__ |
| 47 | + |
| 48 | +Using Feedly |
| 49 | +------------ |
| 50 | + |
| 51 | +This quick example will show you how to publish a Pin to all your |
| 52 | +followers. So let's create an activity for the item you just pinned. |
| 53 | + |
| 54 | +.. code:: python |
| 55 | +
|
| 56 | +    def create_activity(pin): |
| 57 | +        from feedly.activity import Activity |
| 58 | +        activity = Activity( |
| 59 | +            pin.user_id, |
| 60 | +            PinVerb, |
| 61 | +            pin.id, |
| 62 | +            pin.influencer_id, |
| 63 | +            time=make_naive(pin.created_at, pytz.utc), |
| 64 | +            extra_context=dict(item_id=pin.item_id) |
| 65 | +        ) |
| 66 | +        return activity |
| 67 | +
|
| 68 | +Next up we want to start publishing this activity on several feeds. |
| 69 | +First we want to insert it into your personal feed, and secondly into |
| 70 | +the feeds of all your followers. Let's start by defining these |
| 71 | +feeds. |
| 72 | + |
| 73 | +.. code:: python |
| 74 | +
|
| 75 | +    # setting up the feeds |
| 76 | +
|
| 77 | +    class PinFeed(RedisFeed): |
| 78 | +        key_format = 'feed:normal:%(user_id)s' |
| 79 | +
|
| 80 | +    class UserPinFeed(PinFeed): |
| 81 | +        key_format = 'feed:user:%(user_id)s' |
| 82 | +
|
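The `key_format` strings above look like ordinary Python %-style templates keyed on `user_id`. As a minimal sketch, assuming the feed class fills the template with a mapping (an assumption about the internals; `feed_key` is a hypothetical helper and not part of Feedly), each user ends up with one distinct storage key per feed type:

```python
def feed_key(key_format, user_id):
    """Fill a %-style key template with a user id (hypothetical helper)."""
    return key_format % {'user_id': user_id}


normal_key = feed_key('feed:normal:%(user_id)s', 13)
user_key = feed_key('feed:user:%(user_id)s', 13)

print(normal_key)  # feed:normal:13
print(user_key)    # feed:user:13
```

Keeping the feed type in the key is what lets the flat feed and the personal feed of the same user live side by side in one store.
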
| 83 | +Writing to these feeds is very simple. For instance, to write to the feed |
| 84 | +of user 13 one would do: |
| 85 | + |
| 86 | +.. code:: python |
| 87 | +
|
| 88 | +
|
| 89 | +    feed = UserPinFeed(13) |
| 90 | +    feed.add(activity) |
| 91 | +
|
| 92 | +But we don't want to publish to just one user's feed. We want to publish |
| 93 | +to the feeds of all users who follow you. This action is called a |
| 94 | +fanout and is abstracted away in the Feedly manager class. We need to |
| 95 | +subclass the Feedly class and tell it how to figure out which users |
| 96 | +follow us. |
| 97 | + |
| 98 | +.. code:: python |
| 99 | +
|
| 100 | +
|
| 101 | +    class PinFeedly(Feedly): |
| 102 | +        feed_classes = dict( |
| 103 | +            normal=PinFeed, |
| 104 | +        ) |
| 105 | +        user_feed_class = UserPinFeed |
| 106 | + |
| 107 | +        def add_pin(self, pin): |
| 108 | +            activity = create_activity(pin) |
| 109 | +            # add user activity adds it to the user feed, and starts the fanout |
| 110 | +            self.add_user_activity(pin.user_id, activity) |
| 111 | +
|
| 112 | +        def get_user_follower_ids(self, user_id): |
| 113 | +            ids = Follow.objects.filter(target=user_id).values_list('user_id', flat=True) |
| 114 | +            return {FanoutPriority.HIGH: ids} |
| 115 | + |
| 116 | +    feedly = PinFeedly() |
| 117 | +
|
| 118 | +Now that the feedly class is set up, broadcasting a pin becomes as easy as: |
| 119 | + |
| 120 | +.. code:: python |
| 121 | +
|
| 122 | +    feedly.add_pin(pin) |
| 123 | +
|
| 124 | +Calling this method will insert the pin into your personal feed and into |
| 125 | +the feeds of all users who follow you. It does so by spawning many |
| 126 | +small tasks via Celery. In Django (or any other framework) you can now |
| 127 | +show the user's feed. |
| 128 | + |
| 129 | +.. code:: python |
| 130 | +
|
| 131 | +    # django example |
| 132 | +
|
| 133 | +    @login_required |
| 134 | +    def feed(request): |
| 135 | +        ''' |
| 136 | +        Items pinned by the people you follow |
| 137 | +        ''' |
| 138 | +        context = RequestContext(request) |
| 139 | +        feed = feedly.get_feeds(request.user.id)['normal'] |
| 140 | +        activities = list(feed[:25]) |
| 141 | +        context['activities'] = activities |
| 142 | +        response = render_to_response('core/feed.html', context) |
| 143 | +        return response |
| 144 | +
|
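The fanout behind `add_pin` can be pictured as splitting the follower list into batches and handing each batch to a background task. The sketch below is plain Python with no Celery involved; `chunked`, `fanout` and the batch size are illustrative names chosen here, and where a task would be queued the code simply runs the work inline.

```python
from collections import defaultdict


def chunked(ids, size):
    """Yield successive batches of at most `size` ids."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fanout(feeds, follower_ids, activity, batch_size=2):
    """Add `activity` to every follower feed, one batch at a time.

    In a real deployment each batch would be queued as a background
    task; here the batch is processed inline. Returns the batch count.
    """
    batches = 0
    for batch in chunked(follower_ids, batch_size):
        for follower_id in batch:
            feeds[follower_id].append(activity)
        batches += 1
    return batches


feeds = defaultdict(list)
batch_count = fanout(feeds, [1, 2, 3, 4, 5], 'user 13 pinned item 7')

print(batch_count)    # 3
print(sorted(feeds))  # [1, 2, 3, 4, 5]
```

Small batches keep each unit of work short, so a user with many followers does not block the queue for everyone else.
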
| 145 | +This example only briefly covered how Feedly works. The full explanation |
| 146 | +can be found on Read the Docs. |
| 147 | + |
| 148 | +**Documentation** |
| 149 | + |
| 150 | +- `Installing Feedly <https://feedly.readthedocs.org/en/latest/installation.html>`__ |
| 151 | +- `Settings <https://feedly.readthedocs.org/en/latest/settings.html>`__ |
| 152 | +- `Feedly (Feed manager class) implementation <https://feedly.readthedocs.org/en/latest/feedly.feed_managers.html#module-feedly.feed_managers.base>`__ |
| 153 | +- `Feed class implementation <https://feedly.readthedocs.org/en/latest/feedly.feeds.html>`__ |
| 154 | +- `Choosing the right storage backend <https://feedly.readthedocs.org/en/latest/choosing_a_storage_backend.html>`__ |
| 155 | +- `Building notification systems <https://feedly.readthedocs.org/en/latest/notification_systems.html>`__ |
| 164 | + |
| 165 | +**Tutorials** |
| 166 | + |
| 167 | +- `Pinterest style feed example app <http://www.mellowmorning.com/2013/10/18/scalable-pinterest-tutorial-feedly-redis/>`__ |
| 170 | + |
| 171 | +Feedly Design |
| 172 | +------------- |
| 173 | + |
| 174 | +*The first approach* |
| 175 | + |
| 176 | +A first feed solution usually looks something like this: |
| 177 | + |
| 178 | +.. code:: sql |
| 179 | +
|
| 180 | +    SELECT * FROM tweets |
| 181 | +    JOIN follow ON (follow.target_id = tweets.user_id) |
| 182 | +    WHERE follow.user_id = 13 |
| 183 | +
|
| 184 | +This works in the beginning, and with a well tuned database will keep on |
| 185 | +working nicely for quite some time. However at some point the load |
| 186 | +becomes too much and this approach falls apart. Unfortunately it's very |
| 187 | +hard to split up the tweets in a meaningful way. You could split it up |
| 188 | +by date or user, but every query will still hit many of your shards. |
| 189 | +Eventually this system collapses; read more about this in `Facebook's |
| 190 | +presentation <http://www.infoq.com/presentations/Facebook-Software-Stack>`__. |
| 191 | + |
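The read-time join of this first approach can be tried end to end with the standard library's `sqlite3` module. This is only a sketch: the table layout, the sample rows and the `pull_feed` helper are assumptions made for illustration, and the `ORDER BY` and `LIMIT` clauses are additions so the result reads like a feed.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT);
    CREATE TABLE follow (user_id INTEGER, target_id INTEGER);
    INSERT INTO tweets (id, user_id, body) VALUES
        (1, 20, 'first'), (2, 21, 'second'), (3, 22, 'third');
    -- user 13 follows users 20 and 22, but not 21
    INSERT INTO follow (user_id, target_id) VALUES (13, 20), (13, 22);
""")


def pull_feed(conn, user_id, limit=25):
    """Build a feed at read time by joining tweets against follows."""
    rows = conn.execute(
        """
        SELECT tweets.id, tweets.body FROM tweets
        JOIN follow ON follow.target_id = tweets.user_id
        WHERE follow.user_id = ?
        ORDER BY tweets.id DESC
        LIMIT ?
        """,
        (user_id, limit),
    )
    return rows.fetchall()


feed_rows = pull_feed(conn, 13)
print(feed_rows)  # [(3, 'third'), (1, 'first')]
```

Every page view repeats the join, which is why the cost of this design grows with read traffic rather than with write traffic.
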
| 192 | +*Push or Push/Pull* |
| 193 | + |
| 194 | +In general there are two similar solutions to this problem. |
| 195 | + |
| 196 | +In the push approach you publish your activity (e.g. a tweet on Twitter) to |
| 197 | +all of your followers. So basically you create a small list per user to which |
| 198 | +you insert the activities created by the people they follow. This involves a |
| 199 | +huge number of writes, but reads are really fast and can easily be sharded. |
| 200 | + |
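A minimal in-memory sketch of the push approach, assuming a plain dictionary of followers and a capped list per user. The names `followers`, `feeds`, `publish` and `read_feed` are illustrative and do not come from Feedly:

```python
from collections import defaultdict, deque

FEED_LENGTH = 3  # keep each per-user list small

followers = {13: [1, 2], 14: [2]}  # author id -> follower ids
feeds = defaultdict(lambda: deque(maxlen=FEED_LENGTH))


def publish(author_id, activity):
    """Write the activity to the feed of every follower (many writes)."""
    for follower_id in followers.get(author_id, []):
        feeds[follower_id].appendleft(activity)


def read_feed(user_id):
    """Reading is a single lookup of a precomputed list."""
    return list(feeds[user_id])


publish(13, 'tweet A')
publish(14, 'tweet B')

print(read_feed(1))  # ['tweet A']
print(read_feed(2))  # ['tweet B', 'tweet A']
```

Because each feed is keyed by a single user id, the lists can be spread over shards by that id and a read never touches more than one of them.
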
| 201 | +For the push/pull approach you implement the push-based system for a |
| 202 | +subset of your users. At Fashiolista for instance we used to have a |
| 203 | +push-based approach for active users. For inactive users we only kept a small |
| 204 | +feed and eventually used a fallback to the database when we ran out of |
| 205 | +results. |
| 206 | + |
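The fallback described above might look roughly like the following sketch. It assumes the cached feed and the database both return items newest first; `read_with_fallback` and `database_items` are hypothetical names, and the database is replaced by a hard-coded list.

```python
def read_with_fallback(cached_feed, load_from_database, limit):
    """Serve from the pushed feed first, then pull the rest on demand."""
    results = list(cached_feed[:limit])
    if len(results) < limit:
        # The cached feed ran out, so pull the missing items from the database.
        seen = set(results)
        for item in load_from_database():
            if item not in seen:
                results.append(item)
                seen.add(item)
            if len(results) == limit:
                break
    return results


def database_items():
    # Stand-in for a newest-first query against the primary database.
    return ['t5', 't4', 't3', 't2', 't1']


active_user_feed = ['t5', 't4', 't3', 't2']  # fully pushed feed
inactive_user_feed = ['t5']                  # only a small feed is kept

active_page = read_with_fallback(active_user_feed, database_items, 3)
inactive_page = read_with_fallback(inactive_user_feed, database_items, 3)

print(active_page)    # ['t5', 't4', 't3']
print(inactive_page)  # ['t5', 't4', 't3']
```

Active users never reach the database in this sketch; only users whose small feed runs out pay for the slower query.
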
| 207 | +**Features** |
| 208 | + |
| 209 | +Feedly uses Celery and Redis/Cassandra to build a system with heavy |
| 210 | +writes and extremely light reads. It features: |
| 211 | + |
| 212 | +- Asynchronous tasks (All the heavy lifting happens in the background, |
| 213 | + your users don't wait for it) |
| 214 | +- Reusable components (You will need to make tradeoffs based on your |
| 215 | +  use cases; Feedly doesn't get in your way) |
| 216 | +- Full Cassandra and Redis support |
| 217 | +- The Cassandra storage uses the new CQL3 and Python-Driver packages, |
| 218 | + which give you access to the latest Cassandra features. |
| 219 | +- Built for the extremely performant Cassandra 2.0 |
| 220 | + |
| 221 | +**Feedly** |
| 222 | + |
| 223 | +Feedly allows you to easily use Cassandra/Redis and Celery (an awesome |
| 224 | +task broker) to build infinitely scalable feeds. The high level |
| 225 | +functionality is located in 4 classes. |
| 226 | + |
| 227 | +- Activities |
| 228 | +- Feeds |
| 229 | +- Feed managers (Feedly) |
| 230 | +- Aggregators |
| 231 | + |
| 232 | +*Activities* are the blocks of content which are stored in a feed. They |
| 233 | +follow the nomenclature of the `activity stream |
| 234 | +spec <http://activitystrea.ms/specs/atom/1.0/#activity.summary>`__. |
| 235 | +Every activity therefore stores at least: |
| 236 | + |
| 237 | +- Time (the time of the activity) |
| 238 | +- Verb (the action, e.g. loved, liked, followed) |
| 239 | +- Actor (the user id doing the action) |
| 240 | +- Object (the object the action is related to) |
| 241 | +- Extra context (Used for whatever else you need to store at the |
| 242 | + activity level) |
| 243 | + |
| 244 | +Optionally you can also add a target (which is best explained in the |
| 245 | +activity docs). |
| 246 | + |
| 247 | +*Feeds* are sorted containers of activities. You can easily add and |
| 248 | +remove activities from them. |
| 249 | + |
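To make the two concepts concrete, here is a small standalone sketch. `SketchActivity` and `SketchFeed` are hypothetical stand-ins written for this illustration and are not Feedly's own classes; they only mirror the fields listed above and the idea of a container that stays sorted by time.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class SketchActivity:
    actor: int                    # the user id doing the action
    verb: str                     # the action, e.g. 'pin'
    object: int                   # the object the action relates to
    time: datetime                # the time of the activity
    target: Optional[int] = None  # optional target
    extra_context: dict = field(default_factory=dict)


class SketchFeed:
    """Keeps activities sorted newest first."""

    def __init__(self):
        self._activities = []

    def add(self, activity):
        self._activities.append(activity)
        self._activities.sort(key=lambda a: a.time, reverse=True)

    def remove(self, activity):
        self._activities.remove(activity)

    def __getitem__(self, index):
        return self._activities[index]


feed = SketchFeed()
older = SketchActivity(actor=13, verb='pin', object=1, time=datetime(2013, 10, 1))
newer = SketchActivity(actor=13, verb='pin', object=2, time=datetime(2013, 10, 2))
feed.add(older)
feed.add(newer)

newest_first = [activity.object for activity in feed[:25]]
print(newest_first)  # [2, 1]

feed.remove(newer)
remaining = [activity.object for activity in feed[:25]]
print(remaining)  # [1]
```

Re-sorting on every insert is fine for a sketch; a real backend would rely on a structure that is already ordered, such as a sorted set.
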
| 250 | +*Feedly* classes (feed managers) handle the logic used in addressing the |
| 251 | +feed objects. They handle the complex bits of fanning out to all your |
| 252 | +followers when you create a new object (such as a tweet). |
| 253 | + |
| 254 | +In addition there are several utility classes which you will encounter: |
| 255 | + |
| 256 | +- Serializers (classes handling serialization of Activity objects) |
| 257 | +- Aggregators (utility classes for creating smart/computed feeds based |
| 258 | + on algorithms) |
| 259 | +- Timeline Storage (Cassandra- or Redis-specific storage functions for |
| 260 | +  sorted storage) |
| 261 | +- Activity Storage (Cassandra- or Redis-specific storage for hash/dict |
| 262 | +  based storage) |
| 263 | + |
| 264 | +Background Articles |
| 265 | +------------------- |
| 266 | + |
| 267 | +A lot has been written about the best approaches to building feed based |
| 268 | +systems. Here's a collection of some of the talks: |
| 269 | + |
| 270 | +`Twitter |
| 271 | +2013 <http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html>`__ |
| 272 | +Redis based, database fallback, very similar to Fashiolista's old |
| 273 | +approach. |
| 274 | + |
| 275 | +`Etsy feed |
| 276 | +scaling <http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture/>`__ |
| 277 | +(Gearman, separate scoring and aggregation steps, rollups - aggregation |
| 278 | +part two) |
| 279 | + |
| 280 | +`Facebook |
| 281 | +history <http://www.infoq.com/presentations/Facebook-Software-Stack>`__ |
| 282 | + |
| 283 | +`Django project, with good naming |
| 284 | +conventions. <http://justquick.github.com/django-activity-stream/>`__ |
| 285 | +http://activitystrea.ms/specs/atom/1.0/ (actor, verb, object, target) |
| 286 | + |
| 287 | +`Quora post on best |
| 288 | +practises <http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed?q=news+feeds>`__ |
| 289 | + |
| 290 | +`Quora scaling a social network |
| 291 | +feed <http://www.quora.com/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed>`__ |
| 292 | + |
| 293 | +`Redis ruby |
| 294 | +example <http://blog.waxman.me/how-to-build-a-fast-news-feed-in-redis>`__ |
| 295 | + |
| 296 | +`FriendFeed |
| 297 | +approach <http://backchannel.org/blog/friendfeed-schemaless-mysql>`__ |
| 298 | + |
| 299 | +`Thoonk setup <http://blog.thoonk.com/>`__ |
| 300 | + |
| 301 | +`Yahoo Research |
| 302 | +Paper <http://research.yahoo.com/files/sigmod278-silberstein.pdf>`__ |
| 303 | + |
| 304 | +`Twitter’s approach <http://www.slideshare.net/nkallen/q-con-3770885>`__ |
| 305 | + |
| 306 | +`Cassandra at |
| 307 | +Instagram <http://planetcassandra.org/blog/post/instagram-making-the-switch-to-cassandra-from-redis-75-instasavings>`__ |
| 308 | + |
| 309 | +.. |Build Status| image:: https://travis-ci.org/tschellenbach/Feedly.png?branch=master |
| 310 | + :target: https://travis-ci.org/tschellenbach/Feedly |