I'm trying to search through some PDF files, and I'm doing so by converting them to text files using pdftotext, which is fine. But I'm trying to count the occurrences of different words within a paragraph, and pdftotext adds a newline character at what it thinks is the right-hand margin. I'm trying to remove all these single newline characters but keep the doubles, and I can't seem to work it out. I.e.:
This is some text that has
been broken.

Another paragraph.

becomes

This is some text that has been broken.

Another paragraph.
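One hedged way to do this with awk: paragraph mode (RS = "") treats blank-line-separated blocks as records, so the single newlines inside each record can be replaced with spaces while ORS restores the blank line between paragraphs (the file name is a placeholder):

Code:
awk 'BEGIN { RS = ""; ORS = "\n\n" } { gsub(/\n/, " "); print }' input.txt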
I want to remove duplicate or multiple similar lines from multiple files. I.e., if I have four files, file1.txt, file2.txt, file3.txt and file4.txt, I would like to find and remove similar lines from all these files, keeping only one copy of each. I only know that uniq can be used to remove adjacent duplicate lines from a sorted file.
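If merging everything into one deduplicated file is acceptable, the classic awk idiom keeps the first occurrence of each line without needing a sort (the output name is made up):

Code:
awk '!seen[$0]++' file1.txt file2.txt file3.txt file4.txt > merged.txt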
I have a file which I need to list 10 lines at a time, with something like "press Enter to go on" in between. Well, the problem is that I have absolutely no idea how to implement this.
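A minimal sketch; note the inner read takes its input from /dev/tty, because the loop's stdin is redirected from the file:

Code:
#!/bin/bash
count=0
while IFS= read -r line; do
    echo "$line"
    if (( ++count % 10 == 0 )); then
        read -rp "Press Enter to go on..." < /dev/tty
    fi
done < file.txt

(For purely interactive use, piping the file through more or less gives similar paging for free.)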
Using awk I pull the first field of a random line from my datafile:

Code:
myvar1=`awk -F" " 'NR=='$randline' {printf "%s", $1}' myfile`

This works fine. The problem is there will be empty lines at the end of the file. Rather than using awk to filter out blank lines, I would like to figure this out first. So I test $myvar1 for a blank string after setting $randline to one that I know is blank:

Code:
test -z "$myvar1" && echo "true" || echo "false"

But this returns "false", so the string is not zero length. Why? It's a tab-separated file. Is awk storing the tab with the $1 field or something? This is where I get a headache. I try to echo my variable to see what it looks like.
echo "$myvar1" outputs: nothing echo "My variable is [$myvar1]" outputs: [y variable is [
Why is the closing bracket at the beginning? What character could be stored in $myvar1 that would do such a thing and how did it get there?
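One diagnostic sketch: dump the variable's bytes. A trailing carriage return (\r) would explain this exactly, since it sends the cursor back to column 1 and the closing ] overwrites the M:

Code:
printf '%s' "$myvar1" | od -c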
I would like to know how I can get the output from the following dmidecode command in example 1 to look like example 2, without having to grep -v all the unwanted lines. Is there a way in awk or sed? Example 1:
Code:
Processor Information
        Socket Designation: Socket 1 CPU 1
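Example 2 didn't survive in the post, so this is only a sketch of the usual pattern: instead of removing unwanted lines with grep -v, select just the keys you want (the key names here are guesses):

Code:
dmidecode -t processor | awk -F': ' '/Socket Designation|Current Speed/ { gsub(/^[ \t]+/, "", $1); print $1 ": " $2 }'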
I am using 'sed -e /foo/d' to match lines which I want to delete from a file. I discovered I have some lines which contain random (extended?) characters like 'ủ', which I would also like to delete. The lines in the file should only contain alphanumeric characters.
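One hedged approach: under the C locale, delete every line containing a byte outside the printable ASCII range (widen the bracket expression if tabs or other characters are legitimate):

Code:
LC_ALL=C sed -i '/[^ -~]/d' file.txt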
I'm trying to come up with ideas for a simple way to strip a specific "entry" from a text file. I know tools like sed and perl can remove specific lines from a file, but I haven't been able to come up with an elegant way to handle my group of lines. In my file, the first "Location" line and the "SVNPath" line should be unique every time... but are they enough to strip out the whole group, plus the trailing line of white space separating each group? Add to this, my file will grow as new entries are added (always appended to the end), but new entries will have the same formatting.
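If each entry runs from its unique "Location" line down to the blank separator line, sed's two-address range form can delete the whole block in one go (the pattern and file name below are hypothetical):

Code:
sed -i '/^Location \/svn\/myproject$/,/^$/d' entries.conf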
So I need to write a bash script that can read lines, and column 3, from a file. It needs to start on line 16 and read every 20th line from there. The value that it reads needs to be checked; should it be too great, I need it to shut the program down. I'm pretty new to bash, but my ultimate goal is being able to safely run a program on a GPU for an extended period of time without worrying about it overheating. I have a command that outputs information from the GPU every second, and I can save this to a file. So all I really need is something to read and check that file. I played around a bit with the awk command and can't get it to work within my for loop with a dynamic variable.
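awk can do both the line selection and the threshold check in one pass, so the script only needs to act on its exit status. A sketch; the column meaning, the threshold of 90, the log name, and the kill target are all assumptions:

Code:
#!/bin/bash
# awk exits 1 as soon as a sampled reading (line 16, 36, 56, ...) exceeds the limit
if ! awk 'NR >= 16 && (NR - 16) % 20 == 0 && $3 > 90 { exit 1 }' gpu.log; then
    echo "GPU temperature too high, stopping program" >&2
    kill "$program_pid"    # hypothetical PID of the GPU program
fi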
I have a csv-file (A.csv) with a total of 4,600,000 lines. That's too many, and only a few are necessary. I have a txt-file with 150 lines (X.txt); every line is a dataset name from a mainframe and looks like abc.def.123.456. How do I remove lines from A.csv where none of the dataset names from X.txt is present?
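grep can take its patterns from X.txt directly; -F makes the dots literal instead of regex wildcards, so only lines containing one of the dataset names survive (a sketch, writing to a new file):

Code:
grep -F -f X.txt A.csv > filtered.csv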
For example, I have a text file with data which lists numerical values from two separate individuals
Code:
Person A 100 200 300 400 500 600 700 800 900 1000 1100 1200
Person B 1200 1100 1000 900 800 700 600 500 400 300 200 100
How would I go about reading the values for each Person, and then performing calculations on them for each Person (finding the sum, for example)?
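Assuming each person's name and values sit on a single line, as above, awk can total the numeric fields per line (a sketch):

Code:
awk '/^Person/ { sum = 0; for (i = 3; i <= NF; i++) sum += $i; print $1, $2, "sum:", sum }' data.txt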
I would like to parse an input file in which there are two columns in each row. I want to see how many lines are duplicated, where we define a duplicate as having the same second field and a different first field. For instance, if the input file looks like the following:
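The sample input didn't make it into the post, but here is a sketch of the counting itself, assuming whitespace-separated columns: count the distinct first fields seen for each second field, and report second fields that appear with more than one:

Code:
awk '!seen[$1, $2]++ { count[$2]++ }
     END { for (k in count) if (count[k] > 1) print k, count[k] }' input.txt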
I have a table in a text file. How can I remove, for example, "SLS=" from that table when its value is empty? Is it possible to do it in bash, awk or sed? [URL]
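Without seeing the table this is only a guess at the layout: if an empty value appears as the literal text SLS= followed by whitespace or end of line, sed can strip it out:

Code:
sed -E 's/SLS=([[:space:]]|$)/\1/g' table.txt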
I am working on text-processing tasks, and I found that if you assign text to a variable, the trailing newline is chomped automatically:
Code:
variable=$(cat file.txt)
The problem is I can only access the items/lines using:
Code:
for line in $variable
do
    echo $line
    # other commands
done
How do I convert this to an indexed array? More importantly, how do I get access to the individual elements ${line[0]}, ..., ${line[n]}? Another thing: if file.txt has lines with spaces, it is a mess using the for...in loop, but echoing the variable prints it line by line... o_0
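bash 4's mapfile builtin (also spelled readarray) reads the file straight into an indexed array, one line per element, with spaces preserved (a sketch):

Code:
mapfile -t lines < file.txt        # -t strips the trailing newline from each element
echo "${lines[0]}"                 # first line
echo "${#lines[@]}"                # number of lines
for line in "${lines[@]}"; do      # quoting keeps each line intact, spaces and all
    echo "$line"
done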
I have a script that gets a file description of a core file, and I then pull out the name of the process that caused the core. Unfortunately, the process name is pulled with a leading ' and a trailing '. I would like to remove the first and the last character.
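Parameter expansion can trim one character from each end without spawning an external tool (the sample value is hypothetical):

Code:
procname="'myproc'"            # as extracted, with the stray quotes
procname="${procname#?}"       # drop the first character
procname="${procname%?}"       # drop the last character
echo "$procname"               # -> myproc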
However, the ffmpeg command generates a temporary file, blahblah.mpg.tmp, of about 1GB per hour of transcoded video. My issue is that I can't seem to delete these files automatically from any bash script. From the command line, I can cd to the directory and just rm -f *.tmp, and they get deleted. However, from my script, that same command doesn't remove those files. I thought maybe the files were in use, so I put a sleep command in for about an hour before the delete happens, but it still fails. I also put rm -f /mnt/mythtv/*.tmp in a root cronjob, and it still doesn't delete the files.
If I just rm *.tmp, I do get a prompt asking "Are you sure you want to delete this write protected file?", but the -f switch seems to work fine as a normal user from the command line and just deletes them. Does anyone have an idea how to troubleshoot this problem? The particular filesystem that the tmp files are generated on is its own xfs partition, mounted as /mnt/mythtv.
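Since rm -f never prompts, the usual suspects from cron or a script are a different working directory (so a relative glob matches nothing) or the files being recreated afterwards. A hedged debugging sketch that logs what the script actually sees:

Code:
#!/bin/bash
log=/tmp/rm-debug.log
echo /mnt/mythtv/*.tmp      >> "$log"        # what does the glob actually match?
ls -l /mnt/mythtv/*.tmp     >> "$log" 2>&1   # permissions and ownership
fuser -v /mnt/mythtv/*.tmp  >> "$log" 2>&1   # is anything still holding them open?
rm -f -- /mnt/mythtv/*.tmp  >> "$log" 2>&1   # does rm itself report an error?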
I'm looking for a script (bash, python, perl etc) or even a one liner (sed, awk etc) that can take a set of files and remove any line that has more than "x" instances of any character (case sensitive). I have been doing a lot of searching and can only come up with examples of how to remove blank lines, lines that start with a certain character or lines that contain a certain string. This will be used on a system running a Kubuntu derivative.
As a very poor and basic example, I would like to take files that contain lines like:
Code:
And end up with the files only containing the lines:
Code:
If I tell the script that 2 is the maximum number of times any character can appear in any line.
I know this must be possible, but for the life of me I cannot find even an example that will lead me in the right direction or better yet a piece of code I can use.
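A sketch in gawk, which supports splitting a line into single characters with an empty field separator; a line is printed only if no character occurs more than max times, and the counting is naturally case-sensitive:

Code:
awk -v max=2 '{
    delete cnt                       # reset the per-character counts for this line
    n = split($0, ch, "")            # gawk extension: split into individual characters
    ok = 1
    for (i = 1; i <= n; i++)
        if (++cnt[ch[i]] > max) { ok = 0; break }
    if (ok) print
}' input.txt

On a Kubuntu derivative the default awk may be mawk, which historically has not split on the empty string, so invoking this as gawk explicitly is safer.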
I want to delete all files within a specific folder without actually deleting the folder itself. What is a good bash command for this? I found this one, but encountered some errors even though I am executing it from within the specific folder:
Code:
useratdebian:/home/user/folder# find . -type f -exec rm -rf {} ;
[1] 5052
useratdebian:/home/user/folder# find: missing argument to `-exec'
[1]+  Exit 1                 find . -type f -exec rm -rf
The command as it appears is:
Code:
find . -type f -exec rm -rf {} ;
How do I delete only the files contained within the folder called "folder", for example?
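The error comes from the unescaped semicolon: the shell consumes it before find runs, so find never sees the terminator for -exec. Escaping it fixes the command, and GNU find's -delete is simpler still:

Code:
find . -type f -exec rm -f {} \;
# or, with GNU find:
find . -type f -delete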
I am writing a bash script to run every day and output results to a file. When the same results are produced, I want to overwrite the line from the previous day (or remove and re-add it). So if the script finds a variable in a line, I want it to output the results to that line. sed -i did not work for me: sed: couldn't open temporary file ./sedTvOCEg: Permission denied
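That error usually means the script's user cannot write to the file's directory: sed -i creates its temporary file alongside the target, so it needs write permission on the directory, not just on the file. A sketch of update-or-append logic, with hypothetical key and value variables:

Code:
#!/bin/bash
results=/var/log/myscript/results.txt   # hypothetical path; its directory must be writable
key="$1"
value="$2"
if grep -q "^$key " "$results"; then
    sed -i "s|^$key .*|$key $value|" "$results"   # overwrite the existing line
else
    echo "$key $value" >> "$results"              # first time: append
fi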
I have a directory (as a Linux user) with a number of files which have [!] added to the end of each file name, so that the files read: foo something [!].zip, bar something [!].zip, helloworld [!].zip, etc. What is the quickest way to batch rename these, removing the trailing " [!]" from the file names?
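A sketch with bash parameter expansion; quoting the pattern keeps the brackets literal instead of being treated as glob syntax:

Code:
for f in *" [!].zip"; do
    mv -- "$f" "${f%" [!].zip"}.zip"
done

If the perl rename utility is installed, rename 's/ \[!\]//' *.zip does the same in one line, though rename variants differ between distros.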
Take a look at the screenshot attachment: you will notice that I have multiple listings for the hard drives WIN_XP, WIN_7 and GAMES. I was checking out both the NTFS Configuration Tool and MountManager. I removed MountManager, but the multiple listings remain. I would like to have the ability to mount and unmount any drive from the file manager except ARCHIVE; this drive I need mounted as Ubuntu boots up.
I'm running Ubuntu 9.10, dual-booted with Windows XP. Ever since I installed 9.10, I get a long list of OSes (actually, multiple repeats of what appears to be the same Ubuntu install), and I can't get rid of them. I've looked through various tutorials and the GRUB 2 community documentation (https://help.ubuntu.com/community/Grub2), but I still can't get rid of the menu items. I edited the 40_custom file and was able to add the entries I wanted to see, then chmodded 644 all the other files in the /etc/grub.d directory and ran update-grub. The custom entries do appear at the end of the menu now, but the old entries are still there as well. This is what I get when I run update-grub: