General :: Subset A Large Dataset By Specifying The Starting & End Line?

Aug 27, 2010

This is my first time on this forum. I am a statistician, and I am trying to subset a large dataset by specifying the starting and ending lines. The dataset is pretty large (more than 300 million lines), with around 1.2 million consecutive lines per person, so I would like to split it into one chunk per person. I tried wrapping this in R code, but R seems to have to read from the top of the file down to the lines I want, even though I told it to skip the lines that other tasks have already read. As a result, memory use grows with the task ID, and eventually the administrator kicked me off.

I guess the shell could do this much more simply and elegantly. I first thought of the "split" command, but the file has a 10-line header, so I can't split it into even-sized chunks.
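One way this might be done in the shell is sketched below. It assumes every person occupies the same number of consecutive lines after the header; the function name `extract_block` and the file names are made up for illustration. sed still has to scan past the earlier lines, but it streams them instead of holding them in memory, and it stops reading as soon as the last wanted line is printed.

```shell
# extract_block FILE HEADER_LINES BLOCK_SIZE BLOCK_INDEX
# Print data block number BLOCK_INDEX (1-based), where each block holds
# BLOCK_SIZE lines and the first HEADER_LINES lines of FILE are skipped.
extract_block() {
    start=$(( $2 + ($4 - 1) * $3 + 1 ))
    end=$(( start + $3 - 1 ))
    # -n suppresses default output; "p" prints lines start..end and
    # "q" stops reading once the last wanted line has been printed.
    sed -n "${start},${end}p;${end}q" "$1"
}

# Hypothetical usage: person 3 of a file with a 10-line header and
# 1,200,000 lines per person (file names are placeholders).
# extract_block dataset.txt 10 1200000 3 > person_3.txt
```

Each task can then call the function with its own block index, so no task depends on what another task has read.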




Programming :: How To Extract A Subset From A Huge Dataset

Mar 13, 2010

I have a huge file, 450 GB in size. Its format is as follows:

x1 50020 A 1
x1 50021 B 8
x1 50022 C 9

[code]....

Now I want to extract a subset from this file. In this subset, column 1 is x10 and column 2 ranges from 600000 to 30000000. I wrote the following Perl script, but it doesn't work:

#!/usr/bin/perl
$file1 = $ARGV[0]; # Input file
$file2 = $ARGV[1]; # Output file

[code]...

I guess the input and output files are both so big that my script can't handle them.
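A streaming filter avoids loading the file at all, since awk processes one line at a time. The following is only a sketch, assuming whitespace-separated columns as in the sample rows; the function name `extract_subset` and the file names are hypothetical.

```shell
# extract_subset FILE NAME LOW HIGH
# Print rows whose column 1 equals NAME and whose column 2 lies in
# the inclusive range LOW..HIGH. awk reads one line at a time.
extract_subset() {
    awk -v name="$2" -v lo="$3" -v hi="$4" \
        '$1 == name && $2 + 0 >= lo + 0 && $2 + 0 <= hi + 0' "$1"
}

# Hypothetical usage with placeholder file names:
# extract_subset input.txt x10 600000 30000000 > output.txt
```

The `+ 0` forces a numeric comparison, so 600000 is not compared as a string.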


Ubuntu :: Rsync Backup To Remote Server Of Large Dataset

Jun 18, 2010

I have Cygwin on Windows XP running rsync to a remote Ubuntu server over ssh on an ADSL line. My data set is about 20 GB. rsync under Cygwin will back up incrementally, so after the first backup the process should be relatively quick, but over ADSL the first backup will take too long. I was thinking about doing the first backup by copying the files to an external hard drive, then attaching the drive to my remote server and copying the files across. The idea is that rsync will pick up the files as if it had created them in the first instance, and the incremental backups will then carry on from there.

Does anyone have any experience with this, or can anyone provide advice? The external hard drive is FAT32, which is okay with Windows and should be okay with Ubuntu. Copying and pasting from XP with a right click keeps the file dates intact on the external drive. Is this enough to get rsync going incrementally?


General :: Most Efficient Way Of Taking Subset Of Lines

Feb 9, 2010

I have two files: a huge one (200,000+ lines) called 'db' and a big one (15,000+ lines) called 'indices'. What is the quickest way of filtering out the lines in 'db' that contain any index (anywhere on the line) from 'indices'? Is there a faster approach in bash on Linux?
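One commonly used approach is grep with a pattern file. The sketch below assumes each line of 'indices' is a literal string rather than a regular expression, and shows both readings of "filtering out" (keeping the matching lines, or dropping them). The function names are made up for illustration.

```shell
# lines_with_any_index DB INDICES
# Keep lines of DB that contain at least one string from INDICES.
# -F treats each pattern as a fixed string, -f reads patterns from a file.
lines_with_any_index() {
    grep -F -f "$2" "$1"
}

# lines_without_any_index DB INDICES
# Drop those lines instead (-v inverts the match).
lines_without_any_index() {
    grep -v -F -f "$2" "$1"
}

# Hypothetical usage:
# lines_with_any_index db indices > matching.txt
```

Note that an empty line in the pattern file matches every line, so it may be worth stripping blank lines from 'indices' first.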


General :: Lost Command Line Prompt After Starting Totem?

Feb 11, 2010

At the command line prompt I am able to start some applications (such as openoffice.org or evolution), and the prompt reappears after the program is launched, so I can continue working in that terminal. However, other applications, such as Totem or Blackboard, will launch from the terminal but the prompt does not reappear. With Totem I get a message stating "sha module is deprecated use hashlib module instead". With Blackboard the command line simply does not reappear. I have to use Ctrl+C to get the command line back, but this closes the application as well, or I have to open a new terminal. Why do some applications return the prompt when started from the command line while others do not? How do I get the prompt back (other than with q or Ctrl+C)? Thanks to all and kindest regards. (I am using Ubuntu 9.04.)
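A program that stays in the foreground holds the terminal until it exits; starting it in the background with `&` returns the prompt immediately. A minimal sketch, where `launch_detached` is a hypothetical helper name:

```shell
# launch_detached COMMAND [ARGS...]
# Start COMMAND in the background so the shell prompt returns at once.
# nohup lets it survive the terminal closing; output is discarded.
launch_detached() {
    nohup "$@" >/dev/null 2>&1 &
}

# Hypothetical usage:
# launch_detached totem
```

Typing `totem &` directly has the same effect on the prompt, although the program's messages will still appear in the terminal.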


General :: Starting A Command Line Interpreter When Using Gnome Desktop?

Mar 30, 2010

I'm studying Information Technology and doing Linux as part of it. One of the questions in my text book is: Describe three different ways to start a command line interpreter when using the Gnome desktop of openSUSE Linux. I can't for the life of me make sense out of it.


General :: When Starting Xampp - Get -Warning Bogus Unix Line

May 3, 2010

Not sure why, but the last couple of times I have started xampp from a root terminal, I have got this message after each program start.

Warning a bogus unix line

Since the last time it was not there, other than adding some web pages etc., I have only tried to unpack the Control Panel, which was unsuccessful anyway, saying I needed some other programs.


Software :: Filter A Large Document By Line Number?

Feb 17, 2010

I have a set of roughly 50,000 records in a file, one per line. I have another file containing the line numbers of all the records that have an error of some type, e.g. column count, field type, etc. I want to get all those lines into a separate file so I can sanitise them. There are about 3,000-4,000 of them.

How can I extract the lines I want to isolate into a single file? I have all the usual Linux tools available and a bit of understanding of regexps.
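A two-file awk pass is one possible way to do this. The sketch assumes the line-number file holds one number per line and is not empty; `select_lines` and the file names are hypothetical.

```shell
# select_lines LINE_NUMBERS_FILE RECORDS_FILE
# Print the lines of RECORDS_FILE whose line numbers are listed
# (one per line) in LINE_NUMBERS_FILE.
select_lines() {
    awk '
        NR == FNR { want[$1 + 0]; next }   # first file: remember each number
        FNR in want                        # second file: print listed lines
    ' "$1" "$2"
}

# Hypothetical usage:
# select_lines bad_line_numbers.txt records.txt > to_sanitise.txt
```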


Ubuntu :: Shell - Run Command For Subset Of Files In Directory?

Jan 28, 2011

I cannot find a way to run a command on a subset of the files in a directory. How can I do it?
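If the subset can be described by a shell glob, a loop over the pattern is usually enough. This is a sketch only; `run_on_matching` is a made-up helper, and the pattern and command in the usage line are placeholders.

```shell
# run_on_matching DIR PATTERN COMMAND [ARGS...]
# Run COMMAND once for each regular file in DIR whose name matches PATTERN.
run_on_matching() {
    dir=$1
    pattern=$2
    shift 2
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for f in "$dir"/$pattern; do
        [ -f "$f" ] || continue   # skip non-files and unmatched globs
        "$@" "$f"
    done
}

# Hypothetical usage: compress every .log file in the current directory.
# run_on_matching . '*.log' gzip
```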


OpenSUSE :: Starting Applications From Command Line?

Feb 20, 2011

I am a Red Hat admin and also use Ubuntu. I installed openSUSE on my home machine to give it a whirl. I can't seem to figure out why I can't open GUI applications from the command line.

I receive a GTK error when trying to open one with sudo. What am I doing wrong?

EDIT: Never mind, I solved my own question. I had to add DISPLAY and XAUTHORITY to the sudoers file.


Fedora Networking :: Starting DHCP On Command Line?

Sep 25, 2009

I need to start DHCP after booting into run level 1.

So i am going to ....

ifconfig eth1 up

what is the command to start DHCP service?


Fedora Hardware :: Remove Via The Command Line Prior To Actually Starting?

Feb 17, 2010

Yesterday I finally got around to installing my graphics card (NVIDIA GeForce 8400M CS) on Fedora 12 by using the command yum install kmod-nvidia. The terminal told me that it installed correctly, so I rebooted my system. Now when I boot into Fedora, it loads, but when the sign-in window is about to appear, my screen instead shows random colors all over the place. I am hoping someone can tell me how to remove this via the command line before Fedora actually starts.


General :: Error While Starting Apache2 / Syntax Error On Line 113 Of /etc/apache2/httpd.conf?

Nov 19, 2010

I have SUSE 10 64-bit and I was setting up an SVN server on it. After all the required setup, reloading apache2 gives this error:

Code:

httpd2-prefork: Syntax error on line 113 of /etc/apache2/httpd.conf: Syntax error on line 31 of /etc/apache2/sysconfig.d/loadmodule.conf: Cannot load /usr/lib64/apache2/mod_dav_svn.so into server: /usr/lib64/libsvn_subr-1.so.0: undefined symbol: apr_memcache_add_server


Programming :: Fortran - Automatically Find The Starting And Ending Line Of Desired Variable?

Jun 23, 2011

I am loading a file in Fortran. The file looks something like this (shown below). I am interested in the Velocity values and not the Pressure values. Is there a way to write Fortran code that finds the starting line and ending line of the Velocity values, or do I have to find the lines manually? In this case it should return starting line: 9, ending line: 11.

PHP Code:

[code]....


Ubuntu :: Fonts Are Just Too Large And Large

Feb 18, 2011

How big and widely spaced the fonts on the Clementine playlist are, and how good they look on the appmenu (where my mouse pointer is). This is not because Clementine is Qt4; I've got the same problem with Chrome, Opera, etc. I was messing with system-settings (the KDE settings tool) the day before the fonts became that widely spaced, in order to make my KDE apps look more native on my GNOME desktop, but I haven't touched the font settings there.


General :: Create Cron Tab When DSL Line Down Set Automatically Restart The Network Service While DSL Line Up?

Oct 7, 2010

How do I create a crontab entry that automatically restarts the network service when the DSL line comes back up after going down?


General :: Getting The Line Of String From Previous Pipe Output By Line Number?

Feb 8, 2010

After running the following command, I get:

[root@yukiko /]# find / -iname .bashrc
/home/clamav/.bashrc
/home/vpopmail/.bashrc
/etc/skel/.bashrc
/root/.bashrc

But I would like to have a command that prints a specific line by supplying the command with the line number, for example:

[root@yukiko /]# find / -iname .bashrc | getline(2)
/home/vpopmail/.bashrc

Is there such a command on CentOS?
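sed can print a single line by number, so a small wrapper behaves much like the imagined `getline(2)`. The name `nth_line` is made up for this sketch.

```shell
# nth_line N
# Read standard input and print only line N, then stop reading.
nth_line() {
    sed -n "${1}p;${1}q"
}

# Hypothetical usage with the command from the question:
# find / -iname .bashrc | nth_line 2
```

The trailing `q` makes sed quit after the requested line instead of reading the rest of the input.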


General :: Script To Count # Of Chars Per Line - If Line Meets Certain Criteria - And Get Avg #?

Sep 11, 2009

I have several files with many lines something like this:

I'm trying to write a script that will count the number of characters per line that doesn't contain a ">" symbol and give me an average of those values. I have most of the script together but I can't figure out how to connect some of the steps.

Code:
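The poster's script is not shown, so the following is only a sketch of one possible approach rather than a repair of it. It assumes the length of a line means its character count as reported by awk's `length`, and that blank lines without ">" should count as length 0. The function name is hypothetical.

```shell
# avg_line_length FILE...
# Average the length of every line that does not contain ">".
avg_line_length() {
    awk '
        !/>/ { total += length($0); n++ }            # skip lines with ">"
        END  { if (n) printf "%.2f\n", total / n }   # avoid dividing by zero
    ' "$@"
}

# Hypothetical usage:
# avg_line_length sequences.txt
```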


General :: Scripting - Feed An Input File Into An If Statement Line-by-line

Dec 23, 2009

I am trying to write a script that takes an input file ($FileName) and an intermediate file ($FileName.info) and removes lines from $FileName if the value in $2 of $FileName.info is <75.

I can't figure out how to feed only one line of the .info file to the if statement at a time so that it will perceive it as an integer instead of a list.

The error I am getting now is ./script.sh: line 6: [: : integer expression expected

Sample input $FileName

Code:

Code:

Code:

Script so far:

Code:


General :: Parse A File And Print Each Line That Ends With Matching Pattern (if The Next Line Is Blank)

Aug 2, 2010

I've written a script to parse a file and print each line that ends with matching pattern, if the next line is blank. The pattern lines are the result of md5sum $i|sed 's/path///g' so that only md5 and filename appear. Here's what I'm using.

Quote:
for fline in `sed -n '/.*.ext$/p' file1`
do
if [ "`sed -n -e '/'"$fline"'/ {n; p;}' file1`" == "" ]
then
echo ""$fline" has no info" >>file2
fi
done
[Code]....
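Calling sed once per matching line rescans the file each time. A single awk pass that remembers the previous line may be simpler; the sketch below assumes the pattern is a literal suffix such as ".ext", and the function name `report_missing_info` is invented for illustration.

```shell
# report_missing_info FILE SUFFIX
# For each line ending in SUFFIX that is directly followed by an
# empty line, print "<line> has no info".
report_missing_info() {
    awk -v suffix="$2" '
        # Previous line matched and this one is empty: report it.
        pending != "" && $0 == "" { print pending " has no info" }
        { pending = "" }
        # Remember this line if it ends with the suffix.
        length($0) >= length(suffix) &&
        substr($0, length($0) - length(suffix) + 1) == suffix { pending = $0 }
    ' "$1"
}

# Hypothetical usage:
# report_missing_info file1 .ext >> file2
```

A matching line at the very end of the file, with nothing after it, is not reported by this sketch.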


General :: Command Line Way To View A Line Of A File With Context?

Feb 24, 2011

I'd like to show a certain line or lines of a file with context, kind of like a unified diff, on the command line in Linux:

$ (something) -l 154 stuff.py
150: def foo(bar):
151: """

[code]....
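awk can print a numbered window around a given line. A minimal sketch, where `show_context` is a hypothetical name and the amount of context is passed as an argument:

```shell
# show_context LINE CONTEXT FILE
# Print LINE from FILE with CONTEXT lines before and after it,
# each prefixed with its line number.
show_context() {
    awk -v n="$1" -v c="$2" '
        NR >= n - c && NR <= n + c { printf "%d: %s\n", NR, $0 }
        NR > n + c { exit }   # stop once past the window
    ' "$3"
}

# Hypothetical usage: line 154 of stuff.py with 4 lines of context.
# show_context 154 4 stuff.py
```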


General :: Printing Command Line History Without Line Numbers?

Aug 22, 2011

How can I print Linux command line history without including the line numbers? I want to send it all to a text file like this: history >> history.txt
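One option is to strip the leading number with sed. The sketch assumes the default bash `history` format (optional spaces, a number, an optional `*`, then the command) and that HISTTIMEFORMAT is not set; `strip_history_numbers` is a made-up name.

```shell
# strip_history_numbers
# Read `history` output on standard input and remove the leading
# entry number (and the "*" that marks modified entries).
strip_history_numbers() {
    sed 's/^ *[0-9]\{1,\}\*\{0,1\} *//'
}

# Hypothetical usage in an interactive bash session:
# history | strip_history_numbers >> history.txt
```

In bash, `fc -ln` also lists history entries without numbers, which may be enough on its own.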


General :: Sort By Line Size (number Of Characters In A Line)?

Jan 8, 2010

I want to sort a number of lines based on their size:

data:
-------
12345678
87654321
1234

[code]....

Should output as:
-----------------
1
2
12
21

[code]....

But I'm getting this with sort
----------------
1
12
123
1234

[code]....

Can we sort the above "data" text, based on "number of characters" instead of "character order"?
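sort has no built-in "by length" key, but the length can be added as a temporary first column, sorted on numerically, and then removed again. A sketch, assuming a sort that supports the `-s` (stable) option, as GNU and BSD sort do; the function name is hypothetical.

```shell
# sort_by_length [FILE...]
# Sort lines by their number of characters, shortest first.
sort_by_length() {
    awk '{ print length($0), $0 }' "$@" |   # prefix each line with its length
        sort -s -n -k1,1 |                  # numeric, stable sort on that prefix
        cut -d' ' -f2-                      # drop the prefix again
}

# Hypothetical usage:
# sort_by_length data.txt
```

Because the sort is stable, lines of equal length keep their original relative order.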


General :: Appending To The Current Line In A File Instead Of Creating A New Line?

Apr 1, 2011

I am combining data from a couple different input files and creating an output file in a specific format. I notice that if I use the >> operator, information gets appended to a new line in my output file. This is useful, but if I'd like to append onto the CURRENT line, is there an easy way to do this? I've been googling around and see lots of complicated answers, nothing that suggests to me an easy way to do this. For example, if my output file looks like this:

b1a:] cat test
hello my name is
b1a:]

and I'd simply like to append "Bob", how can I do it? If I use

b1a:] echo Bob >> test
b1a:] cat test
hello my name is
Bob
b1a:]

So what I would prefer is some command that would create the result:

hello my name is Bob
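The new line appears because the file already ends with a newline and `>>` writes after it. If the earlier output cannot be written without its trailing newline (for example with `printf '%s' ...`), the last line can be rewritten instead. A sketch, with `append_to_last_line` as a made-up helper:

```shell
# append_to_last_line FILE TEXT
# Add a space and TEXT to the end of the final line of FILE.
append_to_last_line() {
    tmp=$(mktemp) || return 1
    awk -v text="$2" '
        NR > 1 { print prev }                 # earlier lines pass through unchanged
        { prev = $0 }
        END { if (NR) print prev " " text }   # final line gets the extra text
    ' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Hypothetical usage with the file from the question:
# append_to_last_line test Bob
```

Note that awk interprets backslash escapes in the value passed with `-v`, so text containing backslashes would need extra care.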


General :: Match And Combine 2 Text Files Line By Line

Mar 21, 2011

This solution works but is slow with large files. I am looking for a faster solution.

The 2 files contain filenames, one of them has associated data I want to append to the other file's matching filenames

file1:

file2:

I append file2 by matching the unique_filenames and appending them with the tag data and some formatting

appended file2:


Here is the SLOW code

while read inputline.


General :: Sed To Display The Pattern String - The Line Above It And The First Line Of That Para

Mar 30, 2011

I need to grep for a particular string and if found need to display the line containing that string, the line above that and also the first line of that paragraph.

Can this be done via sed?

Eg, My Paragraphs

OA connectA

Enclosure:

Interconnect Module #6 Status:

Here, if I grep for Critical, it should display the following

Similarly if I grep for Degraded, it should display


General :: LQ ISO Too Large For Cd ?

Jun 6, 2010

I downloaded the Ubuntu 10.04 ISO from this site. When the download was complete, I got a screen telling me to insert a writable CD, which I did. It went through a format process and then asked me to drag the files to that directory. When I tried to do that, I got a message saying that I was 138 MB short of space. The ISO was 704 MB and the CD had formatted to something over 500 MB. The disk is a CD-RW rewritable CD.


General :: Script Which Read Line By Line /passwd Using While?

May 13, 2010

I have to do several scripts and I have no idea how to do this one: make a script that reads the passwd file line by line and prints it to the console. I hope you understand, because my English is as bad as you can see. Our teacher told us something like this:

#!/bin/bash
while read line
do
echo $linea
done < dispositive
exit
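A working version of that loop might look like the sketch below. It assumes the variable name in `read` and in the output command are meant to be the same, and that the input file is /etc/passwd; the function name `print_lines` is made up.

```shell
# print_lines FILE
# Read FILE one line at a time and print each line to the console.
print_lines() {
    # IFS= keeps leading and trailing spaces; -r keeps backslashes literal.
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done < "$1"
}

# Hypothetical usage:
# print_lines /etc/passwd
```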


General :: Grep Lines Containing A Certain String PLUS The Line Following That Line?

Sep 1, 2009

I have a dataset (see example below) that I would like to go through and copy all lines containing a certain string ("LGIG") plus the line immediately following that line to a new file. I have no problem grepping lines containing the string LGIG but I'm lost how to translate that to line number and shift up one line number for each instance of that string.

Example input file:

[code].....
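There is no need to work out line numbers: the line after each match can be printed in the same pass. A sketch in awk, treating the search text as a literal string; `match_plus_next` and the file names are hypothetical.

```shell
# match_plus_next STRING FILE
# Print every line of FILE containing STRING, plus the line after it.
match_plus_next() {
    awk -v s="$1" '
        index($0, s) { print; grab = 1; next }   # matching line
        grab         { print; grab = 0 }         # the line right after a match
    ' "$2"
}

# Hypothetical usage:
# match_plus_next LGIG input.txt > output.txt
```

With GNU or BSD grep, `grep -A 1 LGIG input.txt` gives a similar result, though it inserts `--` separator lines between non-adjacent groups.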


General :: Replace Line With Another Line In Shell Script?

Feb 26, 2010

I have the file abc.txt

cat abc.txt
This is a test file
Nothing is new in this world

I want to replace "This is a test file" to "Text is replaced"

Code:
FindString='This is a test file'
ReplaceString='Text is replaced'
Findarray=(`echo $FindString | tr ' ' ' '`)

[Code]....

But this is not effective. How can I replace an entire line using sed, awk, or any other utility?
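Comparing whole lines in awk avoids having to escape regular-expression characters in the search text. The following is a sketch, with `replace_line` as a made-up name; it writes the result to standard output instead of editing the file in place.

```shell
# replace_line FILE FIND REPLACEMENT
# Print FILE with every line that is exactly FIND replaced by REPLACEMENT.
replace_line() {
    awk -v find="$2" -v repl="$3" '
        $0 == find { print repl; next }   # exact whole-line match
        { print }                         # everything else unchanged
    ' "$1"
}

# Hypothetical usage with the strings from the question:
# replace_line abc.txt 'This is a test file' 'Text is replaced' > abc.new
```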








Copyrights 2005-15 www.BigResource.com, All rights reserved