I'm looking for a fast way to verify a copy of a folder with 150 GB of data in 33 files. Some of the files are a few KB, while a few are 20-30 GB. I've done a file count, which is quick but doesn't verify that all the files are intact. I tried running md5sum on them, which works but will probably take as long as copying the files did in the first place. Diff works too, but it is just as slow.
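One way to look at the trade-off described above: a size comparison is almost as cheap as the file count and catches missing or truncated files, while any byte-for-byte check (md5sum, cmp, diff) has to read all the data on both sides, so it is unlikely to be much faster than the copy itself. Below is a minimal sketch of the cheap first pass, assuming a POSIX shell and file names without newlines. The name `compare_sizes` is a hypothetical helper, not an existing tool.

```shell
# compare_sizes SRC DST (hypothetical helper)
# Prints one line per file under SRC that is missing from DST or whose
# size differs there. Prints nothing if every file matches in size.
compare_sizes() {
    src=$1 dst=$2
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        if [ ! -f "$dst/$rel" ]; then
            echo "MISSING $rel"
        elif [ "$(wc -c < "$src/$rel")" -ne "$(wc -c < "$dst/$rel")" ]; then
            echo "SIZE $rel"
        fi
    done
}
```

Empty output only means the sizes agree. It does not prove the contents are identical, so a full checksum pass is still needed when that guarantee matters.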
What I am trying to do is check for duplicate files in a folder and remove a file if it is a duplicate of another file, i.e. the contents are identical regardless of the names.
Basically I am using md5sum to calculate the MD5 value of each file and redirecting the output to a file, and I am thinking of comparing the MD5 values. But I am finding it hard to decide how to complete the code after redirecting the md5sum output to a file.
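The comparison step described above can be sketched as follows: sort the md5sum output so that identical hashes end up next to each other, then print every file whose hash has already been seen. This is only a sketch, assuming GNU md5sum and file names without newlines or backslashes. The name `list_duplicates` is hypothetical.

```shell
# list_duplicates DIR (hypothetical helper)
# Prints the path of every file whose content hash matches an earlier
# file in sorted order. The first file of each group is not printed.
list_duplicates() {
    find "$1" -type f -exec md5sum {} + |
        sort |
        awk 'seen[$1]++ { sub(/^[^ ]+ [ *]/, ""); print }'
}
```

It only lists candidates, so the result can be reviewed first. Deleting them would then be something like `list_duplicates /some/folder | while IFS= read -r f; do rm -- "$f"; done`, ideally after confirming a pair with `cmp`, since equal MD5 values are strong evidence of identical content but not a proof.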
I noticed something a little odd I'm hoping someone can enlighten me on. I noticed in a couple of cases that a package has the proper version, but differs in two regards.
1. The package ends up with a .el4 on the end of the version for Red Hat 4.
2. The actual MD5Sum of the files the package provides differs.
An example below:
Code:
[root@RH4ES32-MCE bin]# for i in `rpm -ql GConf2`;do md5sum $i;done;
md5sum: /etc/gconf/2: Is a directory
9f90335546f7c57ae6fb552cc2b919c5 /etc/gconf/2/path
md5sum: /etc/gconf/gconf.xml.defaults: Is a directory
[code].....
So my package changed slightly to now show .el4 versus just 2-2.8.1-1. I've indicated in the first output above that the first couple of lines differ. I stopped my comparison at that point, as they truly are different.
In order to upgrade a machine that cannot successfully upgrade to 10.4, I downloaded and burned the 10.04.1 ISO image from the Ubuntu alternate download site. My first attempt to burn the image failed at the very end. I did perform an md5sum on it and received precisely the same output as I got from my second burn attempt, which DID complete successfully. Here is the output:
[code]...
I did research this last night and it seems the common wisdom was to reburn the ISO (which I did twice) or download the ISO again. This I also did, and it came down precisely, bit for bit, the same as the first one. Here are the two cksums:
[code]...
Is there something wrong with this image on the website, or could the error about 1 file being unreadable (could that also mean missing?) be erroneous?
I want to move all files and directories that are 1 month old into a separate backup folder. There will be a lot of files and I want to make sure they copy properly. The problem I'm having is integrating md5sum into it to check integrity. md5sum is not recursive, so I figured it would work in a loop: as it copies each individual file, I'll do an md5sum on that file and delete the md5 once it's verified that it copied OK.
[Code]...
I also need some sort of error handling to output all files that didn't pass the hash check.
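The loop described above could look roughly like the sketch below: copy each old file, hash source and copy, remove the source only when the hashes agree, and append every failure to a log. Everything here is an assumption for illustration: the name `backup_old_files`, the argument order, the 30-day default, and the choice to remove the source only after a successful check. It also assumes SRC is given without a trailing slash and file names contain no newlines.

```shell
# backup_old_files SRC DST FAILLOG [DAYS]   (hypothetical helper)
# Moves regular files under SRC that were last modified more than DAYS
# days ago into DST, keeping the directory layout. Paths whose copy does
# not hash the same as the source are written to FAILLOG and kept in SRC.
backup_old_files() {
    src=$1 dst=$2 faillog=$3 days=${4:-30}
    : > "$faillog"                          # start with an empty log
    find "$src" -type f -mtime +"$days" | while IFS= read -r f; do
        rel=${f#"$src"/}                    # path relative to SRC
        mkdir -p "$dst/$(dirname "$rel")"
        cp -p "$f" "$dst/$rel"
        a=$(md5sum < "$f")
        b=$(md5sum < "$dst/$rel")
        if [ "$a" = "$b" ]; then
            rm -- "$f"                      # verified, safe to remove
        else
            printf '%s\n' "$f" >> "$faillog"
        fi
    done
}
```

After a run such as `backup_old_files /data /backup /tmp/failed.txt`, an empty log file means every copy passed the check.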
I'm having trouble understanding how to verify the download of the Fedora ISO files. I know how to do this on a Windows system. I have been looking in the help section for checking the ISO files, but I'm not sure where to find the right hashes, like MD5, SHA1, etc.
All of the discussion about Slackware and KDE prompted me to rsync alien's KDE 4.6 packages (thanks for these, by the way!). Each directory contains the .txz packages with their associated .asc files (all the same signature) and md5s. I want to avoid doing gpg --verify whatever.asc individually for multiple files, and likewise for md5sum -c whatever.md5. Can anyone tell me a nice way of running gpg --verify and md5sum on all files in the directory? I have been playing with wildcards but can't get it right.
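Wildcards alone tend not to work here because each check needs its own invocation, so a loop is the usual answer. Here is a minimal sketch, assuming each package `foo.txz` sits next to `foo.txz.md5` and `foo.txz.asc`, that the .md5 files use the normal md5sum format with relative names, and that the signing key is already in the keyring. The names `verify_md5_dir` and `verify_sig_dir` are hypothetical.

```shell
# verify_md5_dir DIR: run "md5sum -c" on every *.md5 file in DIR.
# Prints the checksum files that failed; returns non-zero if any did.
verify_md5_dir() {
    (
        cd "$1" || exit 2
        failed=0
        for sum in *.md5; do
            [ -e "$sum" ] || continue       # no .md5 files at all
            md5sum -c "$sum" >/dev/null 2>&1 || { echo "FAILED: $sum"; failed=1; }
        done
        exit "$failed"
    )
}

# verify_sig_dir DIR: check every detached signature foo.asc against foo.
verify_sig_dir() {
    (
        cd "$1" || exit 2
        failed=0
        for sig in *.asc; do
            [ -e "$sig" ] || continue
            gpg --verify "$sig" "${sig%.asc}" 2>/dev/null || { echo "BAD SIGNATURE: $sig"; failed=1; }
        done
        exit "$failed"
    )
}
```

Both functions run in a subshell, so the caller's working directory is left alone, and the exit status can be used directly, as in `verify_md5_dir . && verify_sig_dir .`.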
I have a 60GB partition with / and home on it. I logged on yesterday and it gave me a warning saying that I had only 1.9 GB of disk space left. I ignored it for a day and assumed that I had too many videos and pics. But the next day, although I had not added any files or downloaded any software, I had 0B left. I used the disk usage analyser and found that 33GB came from /var/log. It was from two log files, syslog and daemon.log, at 16.5GB each! I opened them up and found that this line of text was repeated hundreds of thousands of times.
Code: Jul 22 19:32:36 aulenback-desktop ntfs-3g[5315]: Failed to decompress file: Value too large for defined data type
I have about 2 TB of 700 MB avi files as data on disc and want to spread them across two 2 TB external USB drives (3.5" SATA inside the housing). Obviously I have to rip them to the laptop and then move them to the external drives (a laborious little task). Am I better off doing the ripping in Meerkat or on a Windows machine? The files need to be accessible from W7, XP, and Meerkat for playing in VLC. What should I format the drives to?
I have a wav file bigger than 8GB that I recorded on a Windows PC. Unfortunately wav files can't be bigger than 2GB, yet somehow I got a file that is almost 9GB. I tried to chop the file into smaller pieces under Ubuntu to open it part by part. I used Gnome Split to divide the file and made 10 parts out of it. Now I have these parts, which no program can read except Gnome Split, and that can only merge them together again, which would bring me back to the beginning of my problem. So my question is: is there any other way to open, or split and open, a wav file of that size, or maybe a way to open the split parts individually?
I've started using the huge.s kernel, and when I try to compile packages Slackware complains about kernel headers, but all I see on the Slackware discs is the smp header files?
Is there any software in Linux to view huge .txt files, say, over 10 MB? I'm now using the default "gedit", version 2.28.0, which seems unable to open huge .txt files. It's the same for the Windows default .txt viewer, but in Windows, "Win Word" seems to work fine. What software under Linux can browse huge .txt files?
syslog, messages and kern.log are incredibly huge files that are taking up a lot of space on my hard drive. Is it safe to remove them and/or to reduce logging so it doesn't take up such an enormous amount of disk space? If so, how can I reduce the logging so it doesn't produce logs that are tens of GB in size? Also, mounting a drive places it into the folder /media. Will it become problematic if the size of the mounted drive exceeds the amount of free space available on my Ubuntu partition?
I started getting errors about running out of disk space in root this morning. I hunted up what's taking all the space; /var/log is 39GB (Ubuntu is installed on a 50GB partition). It's specific files that live in that directory, not subfolders. The files are:
I do monthly reports by copying the previous document, updating the text and changing the images. The images are the same size and number each month. Last month I upgraded my laptop to Natty, and suddenly my document went from 942 kB to 10.1 MB in .odt. When saving to PDF, the usual size of 472 kB went up to 1.9 MB. I have searched the net and the forums but haven't seen anything about a similar issue.
I'm not sure if the issue comes from the previous document being produced in Open Office and now updated and saved in LibreOffice, or if it somehow has to do with the upgrade from Maverick to Natty. I would hope I don't have to uninstall LibreOffice and install Open Office as a solution (which I understand is not entirely easy in Natty; I read something about Open Office being transitional to Libre). I can't email simple documents to customers that are over 10 MB...
I streamed video through my computer with mediatomb yesterday. The problem is that now I have these huge log files. I am running out of disk space (less than 1 GB left) as we speak. They're filled with ufw entries, but my question is:
I read somewhere about a program called logrotate that is supposed to keep logs from getting too big. Is this wrong, and should mediatomb generate 3 separate log files with 5GB of data each for just 2 hours of streaming?
I am facing a strange problem on my server. One of my filesystems shows as 3.1G when I execute the df -h command and the utilization shows as 83%, but when I cd to the directory /usr/local I cannot find any huge files in that filesystem, and I have searched for hidden files as well.
groupserver:~ # df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda9 3.1G 2.5G 532M 83% /usr/local
groupserver:/usr/local # du -sh *
0 bin
93M abinav
My /var/ partition continues to fill up on all my servers, and it is because the logs in /var/log/apache2 or /var/log/mysql are being deleted during log rotate, but their file handles are being held open. Thus, a "du -sh /var/log" shows the correct values, but "df | grep /var" shows something much different.
It seems that the log files rotate, but if I run "lsof | grep deleted" it returns lots of files that are no longer visible in the directory yet refuse to clear themselves off the disk.
The only way I have found to make these log files go away (and thus clear up the disk space on the partition I should have) is to restart either apache or mysql, depending on which process has huge sized log files being held open.
Is it just me, or is this a big flaw in the way Linux works, that it can't figure out how to release the file handle for a log so the disk space can be reclaimed? This is happening to me a lot lately.
Here is some output from one of my web servers so you can see what I am seeing...
root@web49:~# df -h | grep /var$
Filesystem Size Used Avail Use% Mounted on
/dev/sda8 9.2G 6.1G 2.7G 70% /var
root@web49:~# du -sh /var
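For context on the df versus du mismatch above: on Unix-like systems a deleted file's blocks are only freed once the last open descriptor on it is closed, so df keeps counting a rotated log for as long as the daemon still has it open, while du no longer sees it. The same information that "lsof | grep deleted" shows can be read from /proc. The sketch below assumes a Linux /proc filesystem and the `readlink` utility. The name `list_deleted_open` is hypothetical. Run it as root to see other users' processes.

```shell
# list_deleted_open (hypothetical helper)
# Prints "/proc/PID/fd/N -> path (deleted)" for every open descriptor
# that still refers to a file which has been unlinked.
list_deleted_open() {
    for link in /proc/[0-9]*/fd/*; do
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s -> %s\n' "$link" "$target" ;;
        esac
    done
}
```

Restarting the daemon is one way to release the space. Other commonly used options, mentioned here as general suggestions and not as a diagnosis of this particular setup, are a logrotate `postrotate` script that tells the daemon to reopen its logs, or the `copytruncate` option, which truncates the live file instead of deleting it.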
How do I create the huge.s kernel files found on the Slackware disks? Or at least direct me to a post if the same question has been asked before. I currently rsync my files to Alien BOB's script, and I use syslinux to install from my USB stick. I wanted to install using a later kernel just for testing purposes (i.e. 2.6.34-rc3 as of this writing).
I need to transfer a massive amount of data (2.5 terabytes, many files, a directory structure) to an embedded RAID box which has a minimal Linux on it (some custom distro from Western Digital). We tried rsync (version 2.6.7), but it crashes because the file list is too big for the RAM available (fixed in later versions of rsync, but I don't know how to update; it's not Debian based and there are no compiler tools). We tried NFS, but the maximum bandwidth is around 1 MB/sec (CPU bound?), so it would take around 3 weeks this way. Samba has problems with big files (and we have some 20 GB files in there).
SCP isn't installed, and would probably also be CPU bound due to encryption, I think. So the only option left would be FTP. We're currently trying ncftp with the command "put -R /path/to/data/", but it's been running for over an hour, eating up most of the RAM and not using any bandwidth. I think it is still building a file list or something. FTP already worked for a single 20 GB file with an acceptable bandwidth of about 12 MB/sec. Does anyone know a better FTP program (for the console) that can start transferring some data right away, or at least display an estimated time for the copy preparation?
I need to figure out how to arrange for the fastest-possible read-access of a large or huge memory-mapped file. I'm writing high-speed real-time object-chasing software for a NASA telescope (on earth). This software must detect images of fast moving objects (across arbitrary fields of fixed stars), estimate what direction and speed the object image is traveling (based on the length and direction of a streak on the detection image), then chase after the object while capturing new 4Kx4K pixel images every 2~5 seconds, quickly matching its speed and trajectory, then continue to track and capture images until the object vanishes (below horizon, into earth shadow, etc).
I have created two star "catalogs". Both contain the same 1+ billion stars (and other objects), but one is a "master catalog" that contains all known information about each object (128 bytes per object == 143GB) while the other is a "nightly build" that only contains the information necessary to perform the real-time process (32 bytes per object == 36GB) with object positions precisely updated for precession and proper-motion each night. Almost always the information in the "nightly build" catalog will be sufficient for the high-speed (real-time) processes.
Why does my scanning always create huge 50 MB to 100 MB PDF files? Each A4 PNM file is 6.5 MB at a resolution of 150. If I decrease the resolution below 100, my text starts to become unreadable...
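The 6.5 MB figure can be checked with a little arithmetic: an A4 page is 210 x 297 mm, so at 150 dpi it is roughly 1240 x 1753 pixels, and at 3 bytes per pixel (24-bit colour, stored uncompressed as PNM does) that comes to about 6.5 MB per page. A PDF that embeds the pages without much compression would then reach 50 to 100 MB after roughly 8 to 15 pages. The sketch below redoes the calculation in integer shell arithmetic. The name `raw_scan_bytes` is hypothetical, and the 3 bytes per pixel default is an assumption about the scan mode.

```shell
# raw_scan_bytes DPI [BYTES_PER_PIXEL]   (hypothetical helper)
# Uncompressed size in bytes of one A4 page (210 x 297 mm) scanned at
# DPI. BYTES_PER_PIXEL defaults to 3 (24-bit colour); use 1 for 8-bit grey.
raw_scan_bytes() {
    dpi=$1 bpp=${2:-3}
    # 25.4 mm per inch, written as *10/254 to stay in integer arithmetic
    width=$(( 210 * dpi * 10 / 254 ))
    height=$(( 297 * dpi * 10 / 254 ))
    echo $(( width * height * bpp ))
}
```

`raw_scan_bytes 150` gives 6521160 and `raw_scan_bytes 150 1` gives 2173720, so scanning text pages in greyscale instead of colour would already cut the raw size to about a third at the same resolution, before any compression is applied.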