General :: Merging Files With Different Number Of Rows Using Awk?
Apr 18, 2011
Does anyone have a solution for merging files when the number of rows in the two (or more) files is not the same? To exemplify, how about merging the following 3 files:
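A minimal sketch of one common approach, using paste with placeholder file names: paste keeps going when one input runs out of rows and simply leaves its columns empty, so files of different lengths merge cleanly.
Code:
# merge three files side by side, tab-separated; shorter files are padded
# with empty fields once they run out of rows (file names are placeholders)
paste file1.txt file2.txt file3.txt > merged.txt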
I need to write a shell script which can take a number of files, count the total rows from all the CSVs, and display the total number of rows counted across all files. Is there any possibility of doing that using a shell script, and if yes, then how?
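A minimal sketch of such a script, assuming the CSV file names are passed as arguments and that one line equals one row:
Code:
#!/bin/sh
# count_rows.sh (hypothetical name): print the total number of rows
# across every CSV file given on the command line
total=0
for f in "$@"; do
    rows=$(wc -l < "$f")      # lines in this file
    total=$((total + rows))
done
echo "Total rows in $# file(s): $total"

Run it as, for example, ./count_rows.sh *.csv.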
Anyway, I need to take the average of all rows with the same first number (key), i.e.
1, 3, 4.66, 5.66, 5
2, 3.5, 4.5, 5, 3.5
I know this is something awk/sed would be great for, I just don't have enough experience with them to accomplish it. Also, what about averaging those columns together? So, after I output this to a file, I'd like to get another like:
1, 4.58
2, 4.125
The number of columns to add might not always be 4, either. EDIT: this might be easier to do in gnuplot, so I mainly just need an answer to the first part.
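A hedged awk sketch for the first part, assuming comma-separated input in a file called input.txt (a made-up name) and that every row for a given key has the same number of columns; the output order of keys is not guaranteed, hence the sort:
Code:
# average columns 2..NF across all rows sharing the key in column 1
awk -F', *' '
{
    count[$1]++                                  # rows seen for this key
    if (NF > maxnf) maxnf = NF
    for (i = 2; i <= NF; i++) sum[$1, i] += $i   # per-key, per-column totals
}
END {
    for (k in count) {
        line = k
        for (i = 2; i <= maxnf; i++)
            line = line ", " sum[k, i] / count[k]
        print line
    }
}' input.txt | sort -n

For the second part, averaging the remaining columns of each row into a single value works the same way no matter how many columns there are (averaged.txt stands for wherever you wrote the first output):
Code:
awk -F', *' '{ s = 0; for (i = 2; i <= NF; i++) s += $i; print $1 ", " s / (NF - 1) }' averaged.txt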
I'm not familiar enough with sed and awk to be able to solve this problem on my own, so I'm calling on you for a bit of assistance. I'm writing a Nagios plugin to check our Oracle tablespaces, and the output is given in one line like this: 1.04007771 TEMP 0 UNDOTBS1 .005340579 USERS 0 7 rows selected. I've been playing around with sed, like below, to delete the obsolete info and change every second space into a newline:
[code]...
The problem is that I don't know in advance how many tablespaces there are, so I'd have to check all databases and 'hardcode' the tablespaces in my script. Is there any way to 'automate' this, knowing that 'rows selected' preceded by a number is always the last line, and using a sort of counter to auto-adjust the number to put in the -e 's/ / /2' part?
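One way to sidestep counting substitutions altogether, sketched here under the assumption that the line is value/name pairs followed by a trailing "N rows selected.": strip the tail with sed, then let xargs put two fields per line, however many tablespaces there happen to be.
Code:
line='1.04007771 TEMP 0 UNDOTBS1 .005340579 USERS 0 7 rows selected.'
# drop the "N rows selected." tail, then print one pair per line
echo "$line" | sed 's/[0-9][0-9]* rows selected\.$//' | xargs -n2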
If there's a tab-delimited file under /usr/desktop, how can I determine the number of rows and columns of the file in the shell? And, if told that the 3rd column of the file contains only numerical values and that all values in the 5th column are unique, how can I verify these in the shell?
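A hedged sketch with awk, assuming the file is /usr/desktop/data.tsv (the file name is made up):
Code:
f=/usr/desktop/data.tsv

# number of rows and columns (columns taken from the first line)
awk -F'\t' 'NR == 1 { cols = NF } END { print NR " rows, " cols " columns" }' "$f"

# verify that every value in the 3rd column is numeric
awk -F'\t' '$3 !~ /^-?[0-9]+(\.[0-9]+)?$/ { bad = 1 }
            END { print (bad ? "column 3 is NOT all numeric" : "column 3 is numeric") }' "$f"

# verify that all values in the 5th column are unique
awk -F'\t' 'seen[$5]++ { dup = 1 }
            END { print (dup ? "column 5 has duplicates" : "column 5 is unique") }' "$f"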
I am totally new to the terminal! I extracted both the audio and video from an MKV file. Both can be played back without any problem, yet I can't merge them back into a single file.
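A minimal sketch, assuming the extracted streams ended up in files named video.h264 and audio.ac3 (the names are guesses); both ffmpeg and mkvmerge can mux streams back into a container without re-encoding:
Code:
# copy the streams into a new MKV without re-encoding
ffmpeg -i video.h264 -i audio.ac3 -c copy merged.mkv
# or, with mkvtoolnix installed:
# mkvmerge -o merged.mkv video.h264 audio.ac3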
I'm totally new to Linux and this website. I was wondering if anyone could help me create a shell script that would merge two files from two different directories and then put the new merged file in a third, different directory. The merged file would need to eliminate duplicates and have its contents sorted.
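A minimal sketch, with made-up paths: sort can read both files at once, and -u drops duplicate lines while the result is written into the third directory.
Code:
#!/bin/sh
# merge, sort, and de-duplicate two files into a third directory
# (all paths below are placeholders)
sort -u /home/user/dir1/list.txt /home/user/dir2/list.txt > /home/user/merged/list.txt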
In a Linux terminal, how can we get the number of rows and columns from the Linux kernel? I tried the environment variables (LINES, COLUMNS), but I could not retrieve them, as my editor program is a child process of the terminal process.
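A sketch of two ways a child process can ask the terminal itself rather than relying on LINES/COLUMNS (which are shell variables and usually not exported):
Code:
rows=$(tput lines)     # query terminfo for the current terminal
cols=$(tput cols)
echo "terminal size: ${rows}x${cols}"

# or read straight from the controlling terminal; prints "rows columns"
stty size < /dev/tty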
I have a bunch of text logfiles in the following format:
ID (17 characters)
Timestamp (14 characters, YYYYmmddHHMMSS, e.g. "20060210100040" -> 2006/02/10 10:00:40)
Random data (? characters)
end of line
The files are already sorted by timestamp. I need to get one log file with all the logs from the multiple log files, sorted by timestamp. Note that the log files are really huge, around 3-4 GB each (and there are dozens of them). I tried the following command:
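A hedged sketch using sort's merge mode, which streams the already-sorted inputs instead of re-sorting them, so even multi-gigabyte files are handled without loading everything into memory. It assumes the 14-character timestamp is the second whitespace-separated field; if ID and timestamp run together with no separator, key on character positions 18-31 instead.
Code:
# merge pre-sorted logs by the timestamp field
sort -m -k2,2 log1.txt log2.txt log3.txt > merged.log

# fixed-width variant: timestamp occupies characters 18-31 of each line
# sort -m -k1.18,1.31 log*.txt > merged.log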
I would like to know which software can merge different video files (MPG, AVI) into one file. Kino makes a DV file, which is too big. So I'm searching for the equivalent of Microsoft Movie Maker.
I have a LaTeX file which links to many other LaTeX files. The syntax is as follows: \input{*path of file to be inputted*}. The path is relative to the current working directory, so if my file is stored in /home/kevin/mybook.tex and I want to include the file /home/kevin/latexstuff/copyleft.tex, I simply write: \input{latexstuff/copyleft.tex}
The LaTeX compiler includes these files just as if they had been copied and pasted into the main LaTeX file at the point specified. My problem is that I have a document which depends on quite a few of these \input commands, but I am trying to use a LaTeX preprocessor (ratexdb, which adds database fields to your LaTeX documents) which does not support \input commands, leaving my file only half processed. So I was wondering: is there any easy way to parse through my main file, detect only the \input commands, interpret the syntax, include the files specified (where specified), and produce a second, populated file, which can then be processed by ratexdb?
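A hedged awk sketch of such a preprocessor, assuming each \input command sits on its own line and that paths are relative to the directory you run it from; it expands one level only, so nested \input files would need another pass (the script and output names are made up):
Code:
# expand_input.awk: replace lines containing \input{...} with the contents
# of the referenced file, appending .tex when the extension is omitted
{
    if (match($0, /\\input\{[^}]*\}/)) {
        file = substr($0, RSTART + 7, RLENGTH - 8)
        if (file !~ /\.tex$/) file = file ".tex"
        while ((getline line < file) > 0) print line
        close(file)
    } else {
        print
    }
}

Run it as awk -f expand_input.awk mybook.tex > mybook_expanded.tex and feed the expanded file to ratexdb.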
During the past eight years I've used a number of computers with different operating systems and browsers. On each one I made a habit of using the bookmark utility of each browser and saving the bookmarks file. I never ensured the continuity of the bookmark file: with each new computer I started a new bookmark file. Even when I was reinstalling the operating system I didn't import the old bookmark file into the newly installed browser; I've always started a new bookmark file. As a result I have tens of bookmark files for Firefox (JSON format) and IE (HTML file format), each one containing hundreds or thousands of saved links. I also have some files containing links in text format (usually created when I was using someone else's computer).
I would like to be able to manage these bookmark files by using some sort of "bookmark manager" software. The "bookmark manager" should be able to merge the bookmark files into a single collection/file. It should be able to identify and remove the duplicate entries (I have versions of the same bookmark file saved at different times), and it should also be able to group the entries/links into categories (for example, the bookmarked articles on codeproject.com should be grouped under a codeproject category). Not to mention that it should provide a search facility to quickly locate the interesting bookmarks. I couldn't find such software in the Ubuntu Software Center. Do you know something that even comes close to what I need?
I have a problem: I have files with rows of data, and I need to check whether the next row (of the same type) has the NEXT date in it. So I need to extract a date in YYYYMMDD format from a row (easy enough), add one day to it, and compare it to the next date I encounter on a subsequent row.
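A minimal sketch of the date arithmetic, assuming GNU date is available ($next_row_date stands in for whatever you extract from the following row):
Code:
d=20110418                                   # date pulled from the current row
expected=$(date -d "$d +1 day" +%Y%m%d)      # -> 20110419
if [ "$next_row_date" = "$expected" ]; then
    echo "rows are on consecutive dates"
fi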
I've been hitting my head against a wall for a while with this one: as the last part of some data analysis I'm performing, I would like to construct a matrix from a series of different files. These files have the format:
I would like to ask if there is any maximum allowed number of files per folder in Linux (without risking losing everything). I am using openSUSE 11.4 with the latest KDE (4.6?).
I am trying something fast and dirty and it might be that one folder will contain like 10^6 files.
Is there anything I should be warned about?
I need to know how to find the number of files in a directory. Are there any system calls for this in Fedora 12? And I need to know how to perform an operation if that count increases by one.
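A hedged sketch: counting is a one-liner, and reacting to new files can be done with inotifywait (from inotify-tools), which wraps the kernel's inotify system calls; the watched path is a placeholder.
Code:
dir=/path/to/watch
echo "current count: $(find "$dir" -maxdepth 1 -type f | wc -l)"

# run something each time a file is created in the directory
inotifywait -m -e create "$dir" | while read -r path event name; do
    echo "new file: $name"    # the count just went up by one
done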
I have a directory containing around 2.8 lakh (280,000) files. I want to move them to another directory. If I use cp or mv, then I get the error 'argument list too long'. If I write a script like
for file in $(ls *); do cp "$file" {destination}; done
then, because of the ls command, its performance degrades. How can I do this?
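A sketch of one way around the 'argument list too long' error, with placeholder paths: let find produce the names and hand them to mv in batches instead of expanding everything on one command line.
Code:
# move every regular file from the source directory to the destination;
# -print0/-0 keep odd file names safe, and xargs batches the arguments
find /path/to/source -maxdepth 1 -type f -print0 | xargs -0 mv -t /path/to/destination/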
I am facing a problem copying a large number of files, 18 lakh (1,800,000), from my personal hard disk to another hard disk. Each file is very small, and the size of the folder is around 3.95 GB. Copying the files with the copy function Windows provides is frustrating, and I am not even able to compress the files; it gives me an error that they are not readable. The other problem is that I am not able to open this drive in Linux; it shows me an error saying to run a disk check in Windows, and Windows disk check is also not able to repair the drive and goes into some unsolvable mode. Is there any way to open a disk with errors in Windows, and if not, is there any way I can copy the data faster? ERROR: Disk labeled EDU is corrupt; go to Windows, run chkdsk /f there, and reboot into Windows 2 times.
ulimit -a tells me I have a limit of 1024 open files, which is the default on my distro. Is there a way to show how many of these are currently used, or how many are remaining?
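The limit is per process, so one sketch is to count the file descriptors a process currently holds open via /proc (substitute a real PID for <pid>):
Code:
ls /proc/$$/fd | wc -l        # descriptors open in the current shell
ls /proc/<pid>/fd | wc -l     # descriptors open in some other process
cat /proc/sys/fs/file-nr      # system-wide: allocated, free, max handles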
Is there any Linux application for finding the folders with the greatest number of files? baobab sorts folders by their total size; I'm looking for a tool that lists folders by the total number of files in them.
The reason I'm looking is that copying tens of thousands of small files is excruciatingly slow (much slower than copying a few large files of the same total size), so I want to archive or delete the folders with high file counts that will be slowing down the copying (it won't speed things up now, but it would be faster when I need to move/copy them again in the future).
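One hedged command-line possibility, assuming a reasonably recent GNU coreutils (du gained --inodes around version 8.22): it reports inode counts per directory, which approximates the number of files each one contains.
Code:
# directories under the current one, sorted by inode (roughly file) count
du --inodes --max-depth=1 . | sort -n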
I understand that chroot is usually used to provide security; however, for my issue, security is a big don't care. I am very new to using chroot and don't fully understand how a chroot'd env works.
Problem: I'm trying to use a vendor-supplied cross-compile environment. The environment runs as a chroot'd env and works just fine. I have a large number of additional modules that I wish to compile in the chroot'd environment. FYI, these modules are also (successfully) compiled for other targets not using chroot'd envs. Copying the source files into the chroot environment is not an option (I don't have hours to wait for copies to finish, and it would break the make system). Having them live in the environment is also not an option (the chroot build is a tiny part of the build process, and we cannot revamp our entire source tree to accommodate it).
I am looking for a way to have the compiler in the chroot'd env access a path that is outside of the env, typically higher up in the same path that holds the chroot'd env. I have tried soft links (they don't work as expected). Hard links only work for single files, and there are tens of thousands of files that would need to be linked. I am not sure how I would go about exporting the additional files and then mounting the exported files in the chroot'd env (or if that would even work).
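A sketch of the usual answer to this: a bind mount makes a directory that lives outside the chroot visible inside it without copying anything (all paths are assumptions).
Code:
# expose the external source tree inside the chroot
mkdir -p /path/to/chroot/mnt/src
mount --bind /path/to/big/source/tree /path/to/chroot/mnt/src

# inside the chroot it now appears as /mnt/src; remove it later with:
# umount /path/to/chroot/mnt/src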
I'm looking for a way to produce a list of all the directories in the current working directory sorted by the total number of files contained within them.
Initially I thought that Nautilus could be used for this, but then I realised it doesn't count files in subdirectories.
The best I've got for a command line solution so far is this
Code:
The use case for this is a situation where a user has a quota applied to their home directory which limits the number of files they are allowed to have and they have exceeded that limit.
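A hedged command-line sketch that does count files in subdirectories: loop over the directories in the current working directory, let find/wc do the counting, and sort by the count.
Code:
# directories in the CWD sorted by the total number of files they contain
for d in */; do
    printf '%7d %s\n' "$(find "$d" -type f | wc -l)" "$d"
done | sort -n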
What command will provide you with the number of files in your current directory? Choose one answer.
A. ls -c
B. ls | wc -w (this one)
C. ls -n | count
D. ls -wc (this one ?)