General :: GUI Application For Finding / Deleting Duplicate Data Files
Jan 10, 2011
I copied a backup of my Windows 'My Documents' folder and all of its subfolders into my Linux (Mint Debian) Documents directory. I found that many of my files appear in more than one directory, so what I want to do is find all the duplicates and deal with them. Is there a good Linux application to resolve this 'duplicates' problem? (I don't want to touch the Linux system files.)
Is there an application that will take the data from an OpenOffice spreadsheet and locate and mark each address on a map? I've got a long list that I need to sort and locate by zip code.
I want to find and remove duplicate consecutive words from a text file. I've tried working with arrays but it's very difficult, then I tried using sed. Can somebody give me a hint with this sed command: sed ':f;N;$!bf; s/\(.*\) \1/\1 /g; s/\(.*\)\1/\1/g'? It works fine, but if I have three consecutive identical words it only removes the first one and the last two remain intact.
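One way around that limit (a minimal sketch, assuming GNU sed with -E and its \b word-boundary extension) is to let the regex itself match a whole run of repeats, so any number of consecutive identical words collapses to one:

Code:
# collapse runs of repeated words, however long, on each line (GNU sed assumed)
sed -E ':a; s/\b([[:alnum:]]+)([[:space:]]+\1\b)+/\1/g; ta' input.txt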
I have a text file that contains many lines that look like this:
I'm trying to make my script read this text file, find the string sequence "QIEN", and delete everything from this sequence backwards (including "QIEN") so that the above lines look something like this:
I'm aware that grep is good for doing a regular find-and-delete, as follows:
Code:
But this will delete everything on the lines that contain the string sequence "QIEN".
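sed is probably a better fit here than grep, since the goal is to edit each line rather than remove whole lines. A minimal sketch (the file names are placeholders):

Code:
# delete everything up to and including the last occurrence of QIEN on each line
sed 's/.*QIEN//' input.txt > output.txt
# lines that do not contain QIEN pass through unchanged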
I was wondering if anyone knew of a script or program that removes duplicate words in a txt file. I'm making an install script and the install list has gotten a bit long, so I want to ensure there are no duplicates in the file.
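If the list has one package name per line, a classic awk one-liner removes repeats while keeping the original order (install-list.txt is a placeholder name):

Code:
# print each line only the first time it is seen, preserving order
awk '!seen[$0]++' install-list.txt > install-list.deduped
# if order does not matter, sort -u install-list.txt does the same job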
This simple task is proving harder than imagined. I have a multi-level directory that I'm trying to clean of duplicates, but I can't get 'find' to print what I need to see. To give an illustrative example, here is a dir:
Code:
stuart@stuart:~/testdir$ ls *
dir1:

level2:
dir1
So the output of find, as I'd like it to work, would show the two locations of dir1, which would be ./dir1 and ./level2/dir1. But no:
Code:
stuart@stuart:~/testdir$ ls -d */ | head -1 | find . "`cat`" -type d
.
./level2
./level2/dir1
./dir1
dir1/
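If the goal is just to see every place a given directory name occurs, find's -name test does that directly, and duplicated names can be discovered by comparing basenames (a sketch, GNU find assumed):

Code:
# show every location of a specific directory name
find . -type d -name dir1
# list directory names that occur more than once, then show where each one lives
find . -type d -printf '%f\n' | sort | uniq -d |
while read -r name; do
    find . -type d -name "$name"
done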
We have a huge number of duplicate files in a folder and I would like some pointers on writing a bash script to create a list of the duplicate files. I've seen examples that check the md5 sum of files... but I don't need that; the file name is enough.
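When duplicates only need to be detected by name, comparing basenames is enough. A minimal sketch (GNU findutils assumed):

Code:
# list file names that appear more than once anywhere under the current directory
find . -type f -printf '%f\n' | sort | uniq -d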
What I am trying to do is check for file duplication in a folder and remove a file if it is a duplicate of another file, i.e. the contents are duplicates; the names may or may not be the same.
Basically I am using md5sum to calculate the md5sum value of each file and redirecting the output to a file, and I am thinking of comparing the md5sum values. But I am finding it hard to decide how to complete the code after redirecting the output of md5sum to a file.
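Rather than going through an intermediate file, the checksums can be compared as they are produced. A rough sketch (bash 4 associative arrays assumed; the echo is left in so nothing is deleted until the report looks right):

Code:
#!/bin/bash
# keep the first file seen for each checksum; report (or remove) later copies
declare -A seen
while read -r sum file; do
    if [[ -n "${seen[$sum]}" ]]; then
        echo "duplicate: $file (same content as ${seen[$sum]})"
        # rm -- "$file"    # uncomment once the report looks correct
    else
        seen[$sum]=$file
    fi
done < <(find . -maxdepth 1 -type f -exec md5sum {} +)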
I have two folders, Folder abc and Folder xyz, which contain thousands of files, a few of which have the same file names. How can I remove the duplicates from Folder abc?
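One way to do this by name alone is to walk abc and remove anything that also exists in xyz. A hedged sketch (the folder names are taken from the question; the echo prints what would be removed):

Code:
# remove from abc any file whose name also exists in xyz
cd abc || exit 1
for f in *; do
    [ -f "$f" ] && [ -e "../xyz/$f" ] && echo rm -- "$f"   # drop the echo to actually delete
done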
I have the URL of some streaming audio. How do I discover the data type or other details so that I can use console terminal tools to record or otherwise rip the audio? I am trying to time-shift record the stream from a local radio station, much like one does with a DVR or TiVo for television programs.
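Two things that often help here, sketched with a placeholder URL: the HTTP headers usually reveal the stream's content type, and mplayer can dump the raw stream to disk for later listening:

Code:
# inspect the stream's headers to see its MIME type (URL is a placeholder)
curl -sI http://example.com/stream | grep -i content-type
# dump the raw stream to a file; stop with Ctrl-C (or a timeout) when the show ends
mplayer -dumpstream -dumpfile show.dump http://example.com/stream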
I'm using a Mac, and just transferred a bunch of photos from another computer, and as it turns out, there are a bunch of duplicates. I'm not too familiar with the Mac terminal, but if there is a solution for Linux, it will probably work for the Mac. I just need to be able to recursively scan all folders in my Pictures folder and then delete the duplicates.
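fdupes (mentioned again further down) handles exactly this on Linux and can usually be installed on a Mac via a ports system as well; a hedged sketch:

Code:
# list sets of files with identical content, recursively
fdupes -r ~/Pictures
# delete duplicates without prompting, keeping the first file in each set (destructive!)
fdupes -rdN ~/Pictures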
What kinds of methods are there to find duplicate files on Linux?
1. How can I find duplicates just using the file name? I often find that people copy their files to another directory, and I want to find out if there are any identical file names on the Linux box.
2. What about finding duplicate files based on the contents of the file? For example, with picture files: users first save photos from a digital camera under the default file names, but when they want to give a picture to others they rename it. I've used an md5 method for this situation in a Python script, but it takes a long time. (A faster approach is sketched below.)
I'm asking this question just because I use bash scripts a lot at work and I want to try out fdupes at home. Does fdupes use a similar md5 scan to find duplicate files?
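A common way to speed up a content-based scan (roughly the approach tools like fdupes take) is to group files by size first and only checksum the files that share a size. A rough sketch with GNU findutils and coreutils; it will not cope with file names containing newlines:

Code:
# 1) collect size<TAB>path, 2) keep only paths whose size occurs more than once,
# 3) md5sum just those, 4) print groups of identical checksums
find . -type f -printf '%s\t%p\n' \
  | awk -F'\t' '{count[$1]++; files[$1] = files[$1] $2 "\n"}
                END {for (s in count) if (count[s] > 1) printf "%s", files[s]}' \
  | xargs -r -d '\n' md5sum \
  | sort | uniq -w32 --all-repeated=separate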
I'm trying to write a shell script which finds bits of data in a text file. At the moment I'm using grep, and basically I need a function which will look through the text file and take the data out of it. The file has days, months, years, etc., and I want to be able to type 'feb 06' and have it find all of the data for Feb 06.
The problem I have is that I can type 'feb' and all the information comes back for Feb, but I can't get it more precise, e.g. type 'feb 2009' and have it find just Feb 2009; it seems to ignore the latter half. I've tried experimenting with egrep and having two inputs, but I can't seem to fuse them together; it only takes the first input.
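Two common ways to combine the inputs, sketched with placeholder names (datafile, and $1/$2 as the month and year passed to the script): chain two greps so a line must match both, or build one pattern that requires them to appear together:

Code:
#!/bin/bash
# usage: ./search.sh feb 2009
month=$1
year=$2
# a line must contain both the month and the year, in any position
grep -i "$month" datafile | grep -i "$year"
# or: month and year must appear next to each other, separated by whitespace
grep -iE "${month}[[:space:]]+${year}" datafile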
Is there a way to find out the currently installed packages and the corresponding command line to launch each package from a terminal? For example, I know that I have OpenOffice installed, but I do not know how to find the command line to launch it.
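On a Debian/Ubuntu-style system, dpkg can list what a package installed, and filtering for bin/ usually reveals the launch command. A sketch (the package name below is a guess; the first command helps find the exact one):

Code:
# find the installed package
dpkg -l | grep -i office
# show the files a package installed, keeping only executables in bin directories
dpkg -L openoffice.org | grep 'bin/'
# or go the other way: find which package owns a command name you already suspect
dpkg -S soffice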
I have a directory containing a ton of photos, some of which are duplicates, just with different names. Is there any way in Linux to find all the duplicates and remove all of them except the most recent version? I know on Windows there are utilities that will do this through a GUI, but I'm using Linux through the CLI only.
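Keeping the most recent copy in each set takes a small script, since dedicated tools generally keep whichever file they list first. A rough sketch (GNU stat and md5sum assumed; it only prints what it would remove):

Code:
#!/bin/bash
# for each set of files with identical content, keep only the copy with the newest mtime
declare -A newest newest_time
while IFS= read -r -d '' f; do
    sum=$(md5sum "$f" | cut -d' ' -f1)
    mtime=$(stat -c %Y "$f")
    if [[ -z "${newest[$sum]}" ]]; then
        newest[$sum]=$f; newest_time[$sum]=$mtime
    elif (( mtime > newest_time[$sum] )); then
        echo "would remove older copy: ${newest[$sum]}"
        newest[$sum]=$f; newest_time[$sum]=$mtime
    else
        echo "would remove older copy: $f"
    fi
done < <(find . -type f -print0)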
My website was recently infected with this stupid malware, lotultimatebet .cn... there is a hidden iframe on almost all of the 6000 pages of my site. Can you please advise whether it is possible for me to remove this line from all pages by executing some command?
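If the injected iframe sits on its own line and always contains the same domain, sed can strip it from every page in place. A hedged sketch (take a backup first; the file extension and the exact markup on your pages may differ, and an injection spanning several lines needs a different pattern):

Code:
# delete every line mentioning the injected domain from all .html files, keeping .bak backups
find . -name '*.html' -exec sed -i.bak '/lotultimatebet\.cn/d' {} +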
I'm writing a Perl script which performs Linux commands. I have a directory with a load of files.
Code:
exec_cmd('rm $(ls * | grep -v file1)');

This command will delete everything except for file1. How can I modify it to delete all files except for file1 and file2?
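grep accepts several -e patterns, so the existing command can exclude both names; anchoring the patterns avoids catching names that merely contain "file1". Extended globbing avoids parsing ls at all. A sketch of the shell part (drop it into exec_cmd as before):

Code:
# exclude both names from the listing before deleting
rm $(ls * | grep -v -e '^file1$' -e '^file2$')
# or, without parsing ls, using bash extended globbing
shopt -s extglob
rm -- !(file1|file2)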
I had a program run riot and it has created hundreds of spurious files in one directory. Fortunately they are all dated 4th November, so they are easily identified. What bash command can I use from the console to delete them all?
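find can select files by modification date. A sketch assuming the files date from 4 November 2010 (adjust the year) and that GNU find's -newermt test is available; listing first and deleting second is safer:

Code:
# list files in this directory last modified on 4 November 2010
find . -maxdepth 1 -type f -newermt '2010-11-04' ! -newermt '2010-11-05' -print
# once the listing is correct, switch -print to -delete to remove them
find . -maxdepth 1 -type f -newermt '2010-11-04' ! -newermt '2010-11-05' -delete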
I'm using the command below to sync two directories. The problem is that instead of deleting the files on the target directory, it simply appends a ~ character to the end of the file name. I'm not sure why this is happening. I'd like to have all deletes on the source replicated on the target.
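The trailing ~ is the suffix rsync's --backup option (-b) adds to files it would otherwise overwrite or delete, so the likely fix is to drop that option. A sketch with placeholder paths:

Code:
# mirror source to target, actually removing files that no longer exist on the source
rsync -av --delete /path/to/source/ /path/to/target/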
On my site I'm now using a cache and need a cron script that will delete files older than 1 hour. I have the cron feature in my Kloxo control panel; I just need the command.
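find can do this on its own. A sketch with a placeholder cache path (test with -print before trusting -delete):

Code:
# delete regular files under the cache directory modified more than 60 minutes ago
find /path/to/cache -type f -mmin +60 -delete
# as an hourly cron entry:
# 0 * * * * find /path/to/cache -type f -mmin +60 -delete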
Is there any way to delete certain paragraphs within a text file and then insert the paragraph into another text file? I just cannot figure out how to remove the specific lines from the file and then insert them into another file at a certain line within that new file. Thanks again.
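If the paragraph can be identified by line numbers, sed can copy it out, delete it from the source, and read it back into the target at a given line. The line numbers and file names below are placeholders:

Code:
# move lines 10-20 of a.txt so they appear after line 5 of b.txt
sed -n '10,20p' a.txt > paragraph.tmp     # copy the paragraph out
sed -i '10,20d' a.txt                     # delete it from the source file
sed -i '5r paragraph.tmp' b.txt           # insert it after line 5 of the target
rm paragraph.tmp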
I know that rm -i will prompt whether you want to delete each file, but rm -i -r will prompt for each file in each subdirectory recursively. How do I make it prompt just for the directory itself, and then delete its contents without asking? How do I delete all the files in a directory without deleting . and ..? How do I recursively delete all tilde files in a directory? How do GUI file managers delete files to Trash? Where is this "Trash" located? Can you delete to trash from the command line?
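A few illustrative answers, sketched as commands (GNU coreutils assumed; the trash command depends on which desktop tools are installed):

Code:
rm -I -r somedir              # GNU rm: prompt once for the whole recursive delete, not per file
rm -rf somedir/*              # empty a directory; . and .. are never matched, but dotfiles are missed
rm -rf somedir/* somedir/.[!.]* somedir/..?*   # also catch hidden files
find . -type f -name '*~' -delete              # recursively delete tilde backup files
gvfs-trash somefile           # send a file to the desktop Trash (usually ~/.local/share/Trash)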
The privileges for root for UMESH are as below:

Code:
[root@localhost media]# ll
total 4
drwxr-xr-x 3 vickey root 4096 1970-01-01 05:30 UMESH
[root@localhost media]#
We have an rsync cron job set up to mirror all the files in a "..dashtdocsdocs" folder to the same folder on another server. It copies all the files over correctly and deletes any files in the "docs" directory that aren't in the sending directory, but it also deletes any files we put in the target directory's parent folder (..dashtdocs or other subfolders like ..dashtdocsimages) even though they've been excluded in the .rsync-filter file.
So for example server A has ..dashtdocsdocs and ..dashtdocsimages. Server B has ..dashtdocsdocs but if I manually copy the images folder over to ..dashtdocsimages, the images folder gets deleted from the target directory every time rsync runs.
I'd like to keep just the docs directory synched and update other folders manually, but they keep getting deleted. It looks to me like it's running a delete-excluded option, but that option wasn't used.
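If the exclusions only live on the sending side, --delete can still remove the excluded paths from the target; rsync's "protect" filter rule exists for exactly this case. A hedged sketch with placeholder paths (the protect rule can also go in the receiving side's .rsync-filter file, which -F picks up):

Code:
# mirror the docs tree, but tell the receiver never to delete the images folder
rsync -avF --delete --filter='protect images/' /src/parent/ serverB:/dest/parent/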
I have a question that may sound silly. I have removed VirtualBox from my Ubuntu install, but the .VirtualBox folder still exists, with a virtual drive of nearly 10 GB. Can I manually remove the .VirtualBox folder with rm -rf without any unwanted side effects?
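Assuming the folder is the leftover per-user configuration directory in your home directory and none of the virtual disks inside it are needed any more, removing it should be safe:

Code:
rm -rf ~/.VirtualBox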