avoid duplicate files in courses

Eduardo Ponce de León Carreto

avoid duplicate files in courses

Does anybody know how I can avoid duplicate files? I have a file that is used in 100+ courses. If I upload that file to every course, it will take up a lot of space for those 100 copies. Is there a way to avoid this? I am asking because we are planning to move from Blackboard to Sakai, as the content system in Blackboard is way too expensive. Is this possible in Sakai?
_______________________________________________
management mailing list
[hidden email]
http://collab.sakaiproject.org/mailman/listinfo/management

TO UNSUBSCRIBE: send email to [hidden email] with a subject of "unsubscribe"
Matthew Jones

Re: avoid duplicate files in courses

There's nothing built into Sakai that does this automatically. There have been some feature requests that might have gotten part of the way there, like SAK-12340 [1] (Create a resource-type that functions as a "symbolic link" or "alias" for another resource), but it was marked as Won't Fix a while ago. To get it to work, you'd have to change the content hosting back-end to keep track of unique files and make symbolic links on the file system when identical files are created. I'm not sure which algorithm you'd go with for that. The process is somewhat described in [2], but it is not implemented at all in Sakai.
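
To make the "keep track of unique files and make symbolic links" idea concrete, here is a minimal Python sketch. It is not Sakai code and not necessarily the process described in [2]; it assumes a plain directory tree on a POSIX file system, and the choice of SHA-256 plus a byte-for-byte comparison is only one possible algorithm. The function names are made up for illustration.

```python
import filecmp
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def deduplicate(root):
    """Replace byte-identical files under root with symlinks.

    The first copy seen (in sorted path order) is kept as the original.
    Returns the number of bytes reclaimed. Hypothetical helper, not a
    Sakai API.
    """
    first_seen = {}  # digest -> path of the copy that is kept
    saved = 0
    # sorted() materialises the listing before any file is changed.
    for path in sorted(Path(root).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        digest = file_digest(path)
        original = first_seen.get(digest)
        if original is None:
            first_seen[digest] = path
            continue
        # Guard against a hash collision before deleting anything.
        if not filecmp.cmp(original, path, shallow=False):
            continue
        saved += path.stat().st_size
        path.unlink()
        path.symlink_to(original.resolve())
    return saved
```

Calling deduplicate() on a directory returns the number of bytes freed. Treat this as an experiment for a scratch copy of the data, not something to point at a live content store: an application that later overwrites a linked file would change every "copy" at once.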

Typically, if you need this, you can get hardware or software that does it for you (there is a lot of both). The most popular commercial hardware is probably the NetApp filers, which have deduplication technology built into their NAS [3], but there are also a bunch of other vendors listed on [2]. The open source filesystem linked there, LessFS [4], looks like a pretty active project (past its 1.0 release) that could give you this type of functionality if it is of high importance to you. With Sakai you can store all of your binary content on the file system [5] and use FUSE on Linux with a LessFS-formatted partition.

Probably more work than you were expecting but it looks like the best you'll get easily.

I'm not sure if deduplication is in the plan for Sakai 3 or not.


-Matthew


On Tue, Mar 16, 2010 at 7:57 PM, Eduardo Ponce de León Carreto <[hidden email]> wrote:
Does anybody know how I can avoid duplicate files? I have a file that is used in 100+ courses. If I upload that file to every course, it will take up a lot of space for those 100 copies. Is there a way to avoid this? I am asking because we are planning to move from Blackboard to Sakai, as the content system in Blackboard is way too expensive. Is this possible in Sakai?

Steve Swinsburg-3

Re: avoid duplicate files in courses

I just did some experimenting, and have a simple solution that may work. This assumes it's ok for the files to be public. If that is not ok, then this won't work for you.

1. Upload your files somewhere and get the URLs to them. You could put them in a site that is set up specifically to host them.
2. In the other sites that need the files, in Resources, add a Web Link and copy the URL in.
3. In those sites, use the URL to the web link that is in that site. This will cascade through to the original file.

I have tested this in various tools and it works. It means you have your files in a central location, simply linked from each site.

This could all be scripted pretty easily either via a script that talks WebDAV or via the web services.
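
As a rough illustration of the scripting idea, here is a minimal Python sketch that prepares one WebDAV PUT request per site and sends nothing. Everything specific in it is an assumption rather than something confirmed in this thread: the base URL and its per-site path layout, the link file name, and above all the idea that a plain-text body holding the URL is enough for the server to create a Web Link. Check how your Sakai version exposes Web Links over WebDAV or the web services before relying on any of it.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_link_requests(dav_base, site_ids, link_name, target_url, user, password):
    """Build (but do not send) one WebDAV PUT request per site.

    dav_base   -- assumed WebDAV root, e.g. "https://sakai.example.edu/dav"
    site_ids   -- IDs of the sites that should receive the link
    link_name  -- hypothetical resource name for the link in each site
    target_url -- URL of the single, centrally hosted file
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    requests = []
    for site_id in site_ids:
        url = f"{dav_base.rstrip('/')}/{quote(site_id)}/{quote(link_name)}"
        requests.append(
            urllib.request.Request(
                url,
                data=target_url.encode("utf-8"),  # placeholder body, see lead-in
                method="PUT",
                headers={
                    "Authorization": f"Basic {token}",
                    "Content-Type": "text/plain",
                },
            )
        )
    return requests
```

Each returned request can be inspected first (get_method(), full_url) and then passed to urllib.request.urlopen() once the target paths look right. Building the requests separately from sending them makes a dry run across 100+ sites cheap.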

cheers,
Steve



John Norman

Re: avoid duplicate files in courses

On 17 Mar 2010, at 00:25, Matthew Jones wrote:

I'm not sure if deduplication is in the plan for Sakai 3 or not.

Deduplication is a primary requirement of content management in Sakai 3.

John
