Matching line breaks with regular expression find and replace in dofile editor

Sarah Edgington

Join Date: Apr 2014

Posts: 284
#1

09 Dec 2014, 19:32

I have some text that's split out onto multiple lines that I would like to be on one line in my dofile. Obviously I can edit this by hand, but it would be easier if I could search and replace.
I noticed that the dofile editor in Stata 13 has the option to use regular expressions during replace. However, I'm not having any luck matching the line breaks.
I can use a regular expression to insert a line break but not remove one.

For example, say I start with:
this is some text that could be ~ multiple lines using the tilde as a breakpoint

If I do the following:
find what: ~
replace with: \n

I get:
this is some text that could be
multiple lines using the tilde as a breakpoint

However, this isn't reversible. Nothing happens if I do the following on the text after making it two lines:
find what: \n
replace with: ~

I've tried \n, \r, and \r\n. None of these seems to match the Stata line break when used in the "find what" box, though they all work in the "replace with" box. Regular expressions are not my strong suit. Am I missing something obvious here?

I'm not looking for advice on what other editors might allow me to do this. I'm currently on a borrowed computer that I can't install software on. So I'm trying to figure out if this can be done with the tools I have at hand.
ben earnhart

Join Date: May 2014

Posts: 1027
#2

09 Dec 2014, 21:07

Well, replacing all \n in a file would generally be shooting yourself in the foot, so it prevents that. It *ought* to allow exceptions, such as ///\n, but it seems to prevent any replacement of newlines. I suppose it's generally a wise decision on the developers' part, but \n *in combination with* something else seems like a reasonable compromise. Maybe add it to the wishlist for Stata 14?
Roberto Ferrer

Join Date: Apr 2014

Posts: 449
#3

09 Dec 2014, 21:27

This works:

filefilter testtext.do testtext2.do, from(\n) to("") replace
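For anyone who wants to try this on the example from the first post, here is a minimal sketch. It assumes the file may have either Windows (\r\n) or Unix (\n) line endings, so it strips carriage returns first and then replaces the remaining line breaks. The intermediate file name testtext_lf.do is made up for illustration, and the tilde is used as the replacement only so the result is easy to see.

```stata
* Build a two-line example file (same text as in the first post)
tempname fh
file open `fh' using testtext.do, write text replace
file write `fh' "this is some text that could be" _n
file write `fh' "multiple lines using the tilde as a breakpoint" _n
file close `fh'

* Step 1: drop carriage returns, if any, so that Windows (\r\n) and
* Unix (\n) files end up with the same line break
* (testtext_lf.do is a hypothetical intermediate file name)
filefilter testtext.do testtext_lf.do, from(\r) to("") replace

* Step 2: turn each remaining line break into a tilde
filefilter testtext_lf.do testtext2.do, from(\n) to("~") replace

* Show the result
type testtext2.do
```

Note that the final line break of the file is replaced too, so the output ends with a trailing tilde. If you are unsure which line endings a file uses, hexdump testtext.do, analyze should report them.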

You should:

1. Read the FAQ carefully.

2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.
Sarah Edgington

Join Date: Apr 2014

Posts: 284
#4

10 Dec 2014, 11:17

Originally posted by ben earnhart:

Well, replacing all \n in a file would generally be shooting yourself in the foot, so it prevents that. It *ought* to allow exceptions, such as ///\n, but it seems to prevent any replacement of newlines. I suppose it's generally a wise decision on the developers' part, but \n *in combination with* something else seems like a reasonable compromise. Maybe add it to the wishlist for Stata 14?

There are all sorts of things one can do with regular expressions that you generally would not imagine someone would want to do. For instance, \w matches any alphanumeric character, which is generally not going to be useful on its own but can be powerful when combined with other things. Indeed, many elements of regular expressions would, on their own, be destructive to most files. Hobbling one of those seems odd.

In this case I did want to replace all the line breaks in the file in question, since I was working with a small section of text that I wanted to convert into a single line. When writing code and documentation I often find myself removing extraneous spaces, tabs, and line breaks from things copied from Excel or elsewhere. In this case it was easily done by hand, but in more complicated examples it can be nice to have more search and replace power to recognize and change more complex patterns. Usually I use a separate text editor that has regular expression support, but I don't currently have access to a computer with one installed. I was excited to notice that the dofile editor had added regular expression support, since I thought it might minimize my need for a separate editor. Apparently no such luck. The dofile editor does fine dealing with tabs, but the fact that it doesn't match ends of lines is going to be problematic for what I need.

One thing I would love to see is better documentation of exactly what expressions are supported in Stata. There are some outside web resources that help but both the manual and the official Stata FAQ are really sparse on details of what is supported beyond the very basic syntax. Since they clearly state that they use their own parser it seems like it should be straightforward to make clear what's supported. Some more advanced examples would also be nice.
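As a rough illustration of the basic syntax, here are a few patterns run through Stata's regexm() and regexr() string functions. This is only a sketch: it covers the string functions, and it is an assumption that the dofile editor's find box accepts the same patterns, since the two may not share an engine.

```stata
* Anchors, character ranges, and repetition
display regexm("abc123", "^[a-z]+[0-9]+$")

* An explicit character class written out in place of \w
display regexm("do_file", "^[a-zA-Z0-9_]+$")

* regexr() replaces only the first match it finds
display regexr("one~two~three", "~", " ")
```

Spelling out the character class avoids relying on shorthand such as \w, which is exactly the kind of thing the documentation leaves unclear.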

The filefilter solution works, though it requires more steps than doing something in a text editor. It's great for modifying data or modifying multiple files in a loop, but seems overly clunky for the kind of smaller ad-hoc tasks I run into while writing dofiles.
ben earnhart

Join Date: May 2014

Posts: 1027
#5

10 Dec 2014, 11:55

I know you didn't want to/can't go to a separate text processor, but does the computer you're working on have Word? It doesn't have regular expressions, but it can replace "\n" in a section (or an entire document) easily enough. It represents \n (or \r\n) as ^p.
Sarah Edgington

Join Date: Apr 2014

Posts: 284
#6

10 Dec 2014, 12:29

Interesting. I didn't even think about trying Word, since it's terrible for creating plain text. Even just copying and pasting into and out of Word mangles double and single quotation marks in a way that requires fixing before incorporating the text into a dofile (at least when doing text searching tasks). So my default processing choice for plain text is absolutely anything but Word. I suppose it's worth trying, to see if it might be a reasonable stop-gap until I get a new computer and can request installation of my usual tools.