River4 is a node.js river-of-news aggregator that stores its lists and data in Amazon S3.
####Overview
We have a press backgrounder for River4 here. If you're wondering what it is, or why it's significant, this is the first place to go.
If you need help, we have a support mail list, with people who have successfully set up and are running River4 installations. If you're having trouble, this is the place to go.
If you're ready to install the software, you've come to the right place! :-)
- A node.js installation.
- An Amazon account, and an S3 bucket to store the JSON files and a small HTML file.
- One or more OPML subscription list files.
Create an S3 bucket to hold all your subscription lists, rivers, and data for the aggregator.
-
On the node.js system, set an environment variable, s3path, to contain the path to the bucket created in step 1.
export s3path=/river.mydomain.com/
-
Again, on the node.js system, set the two AWS environment variables. This allows the River4 app to write to your bucket.
export AWS_ACCESS_KEY_ID=12345
export AWS_SECRET_ACCESS_KEY=TUVWXYZ
-
Launch river4.js on a node.js system. Suppose that server is aggregator.mydomain.com.
-
Look in the bucket. You should see a data folder, with a single file in it containing the default value of prefs and stats for the app. There's also an index.html file, which will display your rivers in a simple way, providing code you can crib to create your own way of browsing (room for improvement here, for sure).
-
Create a folder at the top level of the bucket called "lists". Save one or more OPML subscription lists into that folder.
-
After a while you should see a new folder called "rivers" created automatically by the software. In that folder you should see one JSON file for each list. It contains the news from those feeds, discovered by River4. This format is designed to plug into the beautfiul" river displayer.
-
If you want to watch the progress of the aggregator, you can view this page.
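
Before launching, it can help to confirm that the three environment variables from the steps above are actually set. The following is a minimal sketch, not part of River4 itself. The helper names `missingVars` and `splitS3Path` are hypothetical, and the idea that the first segment of s3path is the bucket name is an assumption based on the example value /river.mydomain.com/.

```javascript
// Names of the environment variables the install steps ask you to set.
const REQUIRED_VARS = ["s3path", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"];

// Return the names of required variables that are missing or blank.
function missingVars(env) {
  return REQUIRED_VARS.filter((name) => !env[name] || env[name].trim() === "");
}

// Split an s3path such as "/river.mydomain.com/" into a bucket and a key prefix.
// Assumption: the first path segment is the bucket, the rest is a prefix.
function splitS3Path(s3path) {
  const parts = s3path.split("/").filter((part) => part.length > 0);
  return { bucket: parts[0] || "", prefix: parts.slice(1).join("/") };
}

// Report on the current environment without changing anything.
const missing = missingVars(process.env);
if (missing.length > 0) {
  console.log("Missing environment variables: " + missing.join(", "));
} else {
  console.log("Bucket: " + splitS3Path(process.env.s3path).bucket);
}
```

Run it with node in the same shell where you typed the export commands. If it lists any names, set those before starting river4.js.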
- I edit code in an outliner, which is then turned into JavaScript. The "opml" folder in the repository contains the versions of the code that I edit. The comments are stripped out of the code before it's converted to raw JS, so there is information for developers in the OPML that isn't in the main files (though all the running code is in both).
- The first released version is 0.79. Version numbers will increment by one one-hundredth with every release. At some point I'll call it 1.0, then subsequent releases will be 1.01, 1.02 etc.
- When you set up your S3 bucket, make sure that web hosting is enabled and index.html is the name of your index file. Here's a screen shot that shows how to set it up.
- Heroku How To -- get a Heroku server running with Fargo Publisher, the back-end for Fargo.
- Bare-bones Heroku do -- checklist for setting up a Heroku server running Node.js from a Mac desktop.
- The River4 support mail list.
- Chris Dadswell wrote a tutorial for setting up your own River4 installation.
Thanks to two developer friends, Dan MacTough and Eric Kidd, who helped this Node.js newbie get this app up and running.
Specifically thanks to Dan for writing the excellent feedparser and opmlparser packages that are incorporated in River4.
This version can be configured to store its data in the local filesystem instead of S3. See the blog post for details.
New /ping endpoint. A publisher can call it, on behalf of a user, to indicate that a feed has updated and should be read immediately. Radio3 has this facility as of today, as does Fargo.
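As a rough sketch of what a publisher-side call might look like, the snippet below builds a ping URL for a feed. The function name `buildPingUrl` is hypothetical, and the query parameter name `url` is an assumption, since this entry doesn't say how the feed address is passed. Check the source before relying on it.

```javascript
// Build the URL a publisher could request to say that a feed has updated.
// Assumption: the feed address travels in a query parameter named "url".
function buildPingUrl(serverBase, feedUrl) {
  const ping = new URL("/ping", serverBase);
  ping.searchParams.set("url", feedUrl); // percent-encodes the feed address
  return ping.toString();
}

// Example with the server name used elsewhere in this document.
console.log(buildPingUrl("http://aggregator.mydomain.com", "http://example.com/rss.xml"));
```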
Fixed a problem that caused rivers to display only old stories. Full explanation on the blog.
Added more fields to the struct the /status call returns. It now says what the s3path is, what port the server is running on, and, if you've defined an s3defaultAcl (see v0.91), what the value of that parameter is.
A new environment variable, s3defaultAcl, if present specifies the permissions on S3 files we create. The default is public-read. With this parameter, it may be possible to run a private installation of River4.
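The fallback behavior described here can be sketched in a few lines. This is an illustration of the rule (use s3defaultAcl if present, otherwise public-read), not the code River4 actually runs. The name `resolveAcl` is hypothetical.

```javascript
// Default permission for files the aggregator creates, per the note above.
const DEFAULT_ACL = "public-read";

// Use s3defaultAcl when it is set to a non-empty value, else the default.
function resolveAcl(env) {
  const value = env.s3defaultAcl;
  return value === undefined || value === "" ? DEFAULT_ACL : value;
}

console.log("ACL for new S3 files: " + resolveAcl(process.env));
```

For example, with `export s3defaultAcl=private` in the environment the function returns "private", which is one of S3's canned ACL values.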
New <source:outline> elements flow through River4. See the docs for the source namespace for details.
One small change to package.json, and no changes to the JavaScript code.
A subscription list can now contain an include node, so you can have a list of lists. Full explanation in this blog post.
Changed the package.json file to require Node v0.8.x. Previously it was 0.6.x. This should make it possible to deploy on Nodejitsu without modification, per Dave Seidel's report.
Fixed a bug that would cause River4 to crash when processing an item with a null title.
Fixed a bug that would cause River4 to crash when reading an item from a subscription list that didn't have an xmlUrl attribute.
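Both crashes come down to data that's missing a field the code expected. Here's a sketch of the kind of guard that avoids them. It's an illustration only, not the actual fix in River4. The helper names and the shape of the objects (plain objects with `title` and `xmlUrl` properties) are assumptions.

```javascript
// Keep only subscription list entries that carry a feed address.
function usableFeeds(outlines) {
  return outlines.filter(
    (entry) => entry && typeof entry.xmlUrl === "string" && entry.xmlUrl.length > 0
  );
}

// Return a printable title even when the feed item has none.
function safeTitle(item) {
  return item && typeof item.title === "string" ? item.title : "";
}

const list = [{ text: "A folder" }, { text: "A feed", xmlUrl: "http://example.com/rss.xml" }];
console.log(usableFeeds(list).length + " usable feed(s)");
console.log("Title: '" + safeTitle({ title: null }) + "'");
```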
Two fixes, explained here.
Two fixes, explained here.
Now if there's an error in any JSON code we try to parse, we display an error message in the console, along with the path to the S3 file we were trying to read.
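The pattern is roughly the following: wrap the parse in a try/catch and include the path in the message. This is a sketch of the idea, not the code from river4.js. The function name `parseJsonFromPath` and the message wording are hypothetical.

```javascript
// Parse JSON text read from a file, reporting the path if parsing fails.
// Returns undefined on failure so the caller can decide what to do next.
function parseJsonFromPath(jsonText, path) {
  try {
    return JSON.parse(jsonText);
  } catch (err) {
    console.log("Error parsing JSON from " + path + ": " + err.message);
    return undefined;
  }
}

// A good file parses normally; a damaged one logs the path and the error.
parseJsonFromPath('{"ok": true}', "/river.mydomain.com/data/good.json");
parseJsonFromPath("{oops", "/river.mydomain.com/data/bad.json");
```

The file names in the example are made up. Only the bucket path comes from the install steps.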
serverData.stats now has a copy of the last story added to the river. The dashboard page displays it.
New "dashboard" feature. If your server is running at aggregator.mydomain.com, go to:
http://aggregator.mydomain.com/dashboard
There you'll get a real-time readout of what your aggregator is doing.
The HTML source for the dashboard page is in dashboard.opml in the opml folder in the repository.