Programming :: Choice Of Programming Solutions For Collated Document Repository?
Feb 18, 2011
We have documents on multiple workstations and want to collate them into a single repository to provide text search and download. So far we have implemented rsync to copy files from each workstation under a directory for each workstation on a server (incidentally providing a backup) and have set up text search using Xapian with Omega; users access it via a web browser. Still to do is to set up a system to copy files from each workstation's area on the server to the repository.
Many files are duplicated. In these cases we want to preserve the names but keep a single copy of the file; hard links can be used for that. For each file to be copied from a workstation's area into the collated area we need to check whether it is a duplicate (same file size and, if so, same MD5 sum) and, if it is, create a hard link to the original rather than a copy. A system to detect and replace duplicates in the collated area has been written using Ruby and PostgreSQL but the developer cannot commit to continuing this work. It does mean we have a PostgreSQL database populated with "fingerprints" of files in the collated area. My first priority is to get the system working; in the longer term whatever is developed must be maintainable; I do not yet know which language skills are available locally.
I am fluent in bash and competent with awk. Ruby looks nice but I have started to learn Python and do not think it prudent to learn both at the same time. Python's PostgreSQL capabilities are not settled but may be fine for the simple usage required. What to do? A bash solution would run very slowly but could be developed quickly. Language knowledge aside, I have found it difficult to install Ruby on the server (CentOS 5.5; installed rvm but "gem" is still not installed; it seems a very complex system with its own package management).
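The duplicate check described above (compare sizes first, hash only when sizes match, then hard-link instead of copying) could be sketched in Python roughly as follows. This is a minimal sketch, not the existing Ruby system: the function names `md5sum` and `collate` are hypothetical, and the two in-memory dictionaries stand in for the PostgreSQL fingerprint table, whose schema is not given here.

```python
import hashlib
import os
import shutil


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collate(src, dest, by_size, digests):
    """Place src at dest, hard-linking to an identical collated file if one exists.

    by_size maps file size -> list of collated paths of that size.
    digests maps collated path -> MD5 hex digest, filled in lazily.
    Both are hypothetical stand-ins for the fingerprint database.
    Returns "linked" or "copied".
    """
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    size = os.path.getsize(src)
    candidates = by_size.setdefault(size, [])
    src_digest = None
    if candidates:
        # Hash only when some collated file already has the same size.
        src_digest = md5sum(src)
        for original in candidates:
            if original not in digests:
                digests[original] = md5sum(original)
            if digests[original] == src_digest:
                # Hard links only work within a single filesystem.
                os.link(original, dest)
                return "linked"
    shutil.copy2(src, dest)
    candidates.append(dest)
    if src_digest is not None:
        digests[dest] = src_digest
    return "copied"
```

Hashing lazily keeps the common case (a file whose size is unique) down to a single `stat` call plus the copy. The sketch does not handle a `dest` that already exists, and matching MD5 sums are treated as proof of identical content, as in the scheme described above.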
I keep time sheet entries at work in an sqlite database called 'timesheet'. I have a shell script called 'today' which queries for all timesheet entries which are less than 24 hours old; it looks like this:
I am creating a document using LaTeX with the existing article class, "\documentclass[twocolumn]{article}". The paper needs to have a two-column format, but I have figures that I would like to include in landscape orientation, possibly on a new page (they appear too small even if I have them span one or both columns).
This has happened to me before too! A few months back, while working on a priority queue in C, I successfully compiled the code and it gave the desired output. I showed the running code to my seniors and committed it to the repository at 21:00 (I was in the office until late at night). The next day my boss asked me to show the running code, and it didn't work! I updated from the repository to get whatever I had committed, but still nothing worked. Then this Saturday my boss fixed a bug, committed it to the repository and asked me to check it out, which I did this morning, and when I ran the code it didn't work.
I have a local git repository that pushes to a remote repository. That remote repository moved to a new server. How do I make "git push" and "git pull" push/pull to/from the new repo?
I'm trying to replace an office file server. I would like to avoid just another samba share.
I'm looking for a document repository, a bit more functionality than a plain samba share and very cross-platform.
I spent a couple of minutes looking at DSpace, but it seems like a lot of work just to configure. Dropbox would be fine except that it only offers up to 100 GB, and it's off-site.
This is NOT for unauthenticated public use.
Here are some features I have in mind:
1. Web front end.
2. Any file format from a one-line text document to a Microsoft Word document to an ISO of a blu-ray disk to a very large database backup, binary or text.
3. Cross-platform clients, mostly Mac.
4. Authenticated via centralized one-login server or maybe by a key such as an SSH public key.
5. Searchable by terms, name or content if the type is appropriate.
6. Pass in the URL for an object and have the server download it.
7. Stores files in native format so if the app breaks I can just get the files.
I am interested in learning 3D programming. The thing is, I would hate to put too much effort into learning something that has no future and is dying. My favorite language at the moment is Java. My goal is professional programming.
So I have several questions:
1. Should I learn JOGL, or start learning C++ and do OpenGL programming in C++?
2. Is there a big difference between JOGL and C++ OpenGL programming?
3. Is it worth learning OpenGL? Does it have a future?
4. Is there a big difference between OpenGL and DirectX coding?
5. If I choose Java, should it be JOGL or LWJGL?
How do I get xsane to scan a document and display it as a full 8.5x11 sized document instead of something half that size? I've been trying and trying and can't seem to figure it out.
Does anyone have better documentation or an updated version of the Tomcat HOWTO for openSUSE? The existing document references 10.2. A document for use with SLED would also help.
Groovy is an object-oriented programming language for the Java platform. I do not have experience in Java, only Perl and shell scripts. Recently I have been asked to maintain (and enhance) software written in Groovy. Can I learn Groovy without knowing Java, or do I have to learn Java before venturing into Groovy?
I searched YouTube but my results were not great. I have two books on kernel programming. I feel it would be great if I could get a video tutorial that helps me understand how to develop a Linux device driver. I had a look at Greg Kroah-Hartman's video lecture on developing patches on ...... I have been reading books and a lot of other material, so a video lecture would suit me better.
I have a server listening for incoming client connections. Once a client establishes an SSL connection with the server, the server waits on read() from the client. Only the client can close the connection. I want a timer in the server program that waits up to x seconds in read() and then disconnects the client.
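One common way to get this behaviour is a per-socket receive timeout rather than a separate timer. The sketch below shows the idea in Python on a plain TCP socket. It is an illustration under assumptions, not the poster's code: the helper name `read_with_timeout` is hypothetical, and the original program presumably calls read() from C on an SSL connection. Python's `ssl.SSLSocket` inherits the same `settimeout()` method from `socket.socket`.

```python
import socket


def read_with_timeout(conn, seconds, bufsize=4096):
    """Wait up to `seconds` for data from the client.

    Returns the received bytes (b"" means the client closed the connection),
    or None if nothing arrived in time, in which case the server side closes
    the connection itself.
    """
    conn.settimeout(seconds)
    try:
        return conn.recv(bufsize)
    except socket.timeout:
        # No data within the allowed time: disconnect the client.
        conn.close()
        return None
```

In a C server the equivalent effect is usually obtained by waiting in select() or poll() with a timeout before calling read(), or by setting the SO_RCVTIMEO socket option.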
I want the video to pause when I click on it with the mouse. I notice that there is a "BaconVideoWidget", which I guess is the video rendering widget, but it doesn't have a signal named "clicked":
I have a server program which accepts multiple client connections and uses polling: every 2 seconds it checks each client for received data after binding. I have used setitimer but I get a runtime error. The server accepts all client connections but doesn't process any message that a client sends.
I am learning network programming from a book by Richard Stevens. The sample source code (http://www.unpbook.com/unpv13e.tar.gz) is provided with the book; I downloaded the archive and unpacked it in the /usr/src folder. I followed the instructions given in the README of the downloaded archive.
A simple TCP-based chat server could allow users to use any TCP client (telnet, for example) to communicate with each other. For this question you should consider a single-process, single-threaded server that can support exactly 2 clients at once; the server simply forwards whatever is sent from one client to the other (in both directions). Your server must not insist on any specific ordering of messages: as soon as something is sent from one client it is immediately forwarded to the other client. As soon as either client terminates the connection the server can exit.
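The forwarding requirement above is usually met with I/O multiplexing: the server waits on both client sockets at once and relays from whichever becomes readable. A minimal Python sketch using `select` follows. The function names `relay_once` and `serve` are hypothetical, and the exercise may well expect C with select() or poll() instead.

```python
import select
import socket


def relay_once(first, second, timeout=None):
    """Forward whatever is ready on either socket to the other one.

    Returns False once either client has closed its connection, else True.
    """
    peer = {first: second, second: first}
    readable, _, _ = select.select([first, second], [], [], timeout)
    for sock in readable:
        data = sock.recv(4096)
        if not data:
            # An empty read means this client terminated the connection.
            return False
        peer[sock].sendall(data)
    return True


def serve(port):
    """Accept exactly two clients, relay between them, exit when one leaves."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("", port))
    listener.listen(2)
    first, _ = listener.accept()
    second, _ = listener.accept()
    while relay_once(first, second):
        pass
    for sock in (first, second, listener):
        sock.close()
```

Because `select` reports whichever socket has data, neither client has to speak first and no message ordering is imposed. The blocking `sendall` is a simplification: a client that stops reading could stall the relay.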
I have a problem in socket programming: when I try to write the received message to a file, nothing gets written. This is the code:
My problem is at run time: the file is created but nothing is written to it, so log.txt is empty. This is only sample code, so please don't tell me that functions are not initialised; I have initialised them. Please look only at func1(): my only problem is that I cannot write the message received from the client to the file.
I've been working in the real world for a year making some money to go back and finish my masters, and now I'm coming to the end of my contract and am realising how little I remember and how small my scope has become; I basically do shell scripts and Perl these days, and it's making me uneasy. So instead of bitching about it, I'm going to endeavor to complete Project Euler using a randomly chosen programming language for each problem, selected from
And post the fruits of my attempts on my blog (shameless plug), as well as opening up my svn repository when I'm done (although I need to ask in another thread about svn permissions). To my shame I haven't ever touched C#, JavaScript, Ruby or Python, so all in all it's going to be very interesting to see how much I screw up. Does anyone have any additional ideas, or languages I'm missing? I was considering Tcl, Haskell or Erlang at a stretch, but I don't know how useful these three would be.
I am trying to write a UART (serial device interface) program on a Linux machine, but when I run the following code to receive data I have to press the Enter key (carriage return) before anything is received, and I want to avoid having to press Enter.
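If the need to press Enter comes from the terminal line discipline running in canonical (line-buffered) mode, which is an assumption since the poster's code is not shown, the usual remedy is to switch the port to non-canonical mode via termios. The sketch below uses Python's `termios` module; the helper name `set_noncanonical` is hypothetical, and the same flags apply to tcgetattr()/tcsetattr() in C.

```python
import os
import termios


def set_noncanonical(fd):
    """Deliver bytes to read() as they arrive instead of waiting for a newline."""
    attrs = termios.tcgetattr(fd)
    # attrs[3] is the local-mode flag word: turn off line buffering and echo.
    attrs[3] &= ~(termios.ICANON | termios.ECHO)
    # attrs[6] is the control-character array.
    attrs[6][termios.VMIN] = 1   # read() returns once at least one byte is available
    attrs[6][termios.VTIME] = 0  # no inter-byte timeout
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
```

It would be applied to the descriptor of the opened serial device (for example one obtained with `os.open()` on a device path such as /dev/ttyS0, a hypothetical name here) before the first read.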
I want a process that can operate as both a TCP echo server and a UDP echo server. The process can provide service to many clients at the same time, but involves a single process that does not start up any other threads.
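A single process can serve both protocols by waiting on the TCP listening socket, the UDP socket and every accepted TCP connection in one `select` call. The following Python sketch shows one way to structure it. The names `make_listeners` and `serve_once` are hypothetical, and the loopback address and port 0 (let the OS pick a port) are placeholder defaults.

```python
import select
import socket


def make_listeners(host="127.0.0.1", port=0):
    """Create a TCP listening socket and a UDP socket bound to host/port."""
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    tcp.bind((host, port))
    tcp.listen(16)
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind((host, port))
    return tcp, udp


def serve_once(tcp, udp, clients, timeout=None):
    """Handle one round of ready sockets; `clients` lists accepted TCP connections."""
    readable, _, _ = select.select([tcp, udp] + clients, [], [], timeout)
    for sock in readable:
        if sock is tcp:
            # New TCP client: remember it so later rounds watch it too.
            conn, _ = tcp.accept()
            clients.append(conn)
        elif sock is udp:
            # UDP is connectionless: echo the datagram back to its sender.
            data, addr = udp.recvfrom(65535)
            udp.sendto(data, addr)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)
            else:
                # Empty read: this TCP client disconnected.
                clients.remove(sock)
                sock.close()
```

A server would call `serve_once(tcp, udp, clients)` in an endless loop with `clients` starting as an empty list. The blocking `sendall` keeps the sketch short; a production server would also watch for writability so that one slow client cannot hold up the others.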