regex question: replace
Hi,
I've been getting into java.util.regex lately. Having used Perl for regex, I'm trying to get familiar with the "spirit" of Java's regex.
For replacement we can use replaceAll or replaceFirst, but:
- what if I want to replace only the third or fourth match?
- what if I want to replace the second through fourth matches?
In Perl we use "regex_expression_here for 2..4;", for instance.
If you know of any interesting websites or tutorials on Java regex, that would be great.
Thanks for your help.
Rgds,
SR
A good site on regular expressions is www.regular-expressions.info
Kaj
Thanks for the reply, but this site seems to expect you to pay for knowledge...
> Thanks for the reply however this site imply to pay
> for knowledge...
You can buy some books from the site, but it has loads of tutorials and code examples for free.
http://www.google.com/search?q=java+regex+tutorial
Indeed, but my question is not answered on this site. I've been through many sites and tutorials, and each time only replaceFirst and replaceAll are presented, with nothing about replacing matches other than ALL or FIRST (like the third, or the fifth through seventh...).
You will have to implement your own replace method, something like this:
public String filteredReplace(String source, String pattern, ReplaceFilter f){
    Pattern p = Pattern.compile(pattern);
    StringBuffer sb = new StringBuffer();
    Matcher m = p.matcher(source);
    int lastEnd = 0;
    while (m.find()){
        // copy the text between the previous match and this one
        sb.append(source.substring(lastEnd, m.start()));
        if (f.shouldReplace(m.group())){
            sb.append(f.getReplacement(m.group()));
        }else{
            sb.append(m.group());
        }
        lastEnd = m.end();
    }
    // copy whatever follows the last match
    sb.append(source.substring(lastEnd));
    return sb.toString();
}
public interface ReplaceFilter {
    public boolean shouldReplace(String match);
    public String getReplacement(String match);
}
//NOTE: Has not been tested
LRMKa at 2007-7-14 0:59:26 >
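
For the specific case asked about (replace only the Nth through Mth match), a counter around Matcher.find() is enough. The following is a minimal sketch, not a JDK facility: the class name RangeReplace and the method replaceRange are made up for illustration, and match numbers are assumed to be 1-based and inclusive. It relies only on Matcher.appendReplacement, Matcher.appendTail and Matcher.quoteReplacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of java.util.regex.
class RangeReplace {

    /**
     * Replaces only the matches numbered from..to (1-based, inclusive).
     * Every other match is copied through unchanged.
     */
    static String replaceRange(String input, String regex,
                               String replacement, int from, int to) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer out = new StringBuffer(input.length());
        int count = 0;
        while (m.find()) {
            count++;
            String text = (count >= from && count <= to) ? replacement : m.group();
            // quoteReplacement makes '$' and '\' in the text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Replace the 2nd through 4th ']' with '|'.
        System.out.println(replaceRange("a]b]c]d]e]f", "\\]", "|", 2, 4));
    }
}
```

Because the replacement goes through quoteReplacement, group references such as $1 are treated as literal text here; drop that call if group references in the replacement are wanted.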

oki!
I thought that I could use a builtin method or something shorter to do such a simple task (simple in Perl, however...).
Thanks for the code, I'll adapt and test it and correct it if needed - then I'll give you feedback.
Regards,
SR
> oki!
>
> I thought that I could use a builtin method or
> something shorter to do such a simple task (simplein
> PERL however...).
>
> Thanks for the code, I'll adapt and test it and
> correct it if needed - then I'll give you feedback.
>
> Regards,
>
> SR
You might find http://elliotth.blogspot.com/2004/07/java-implementation-of-rubys-gsub.html
or a small variation on it useful.
I've been away from Perl for a while. Could you remind me what the simple idiom for replacing the second through the fourth matches is? I know extracting them is easy enough, but as far as I can remember, selectively replacing them would take just about as much work as it does in Java.
Yep,
here is a sample of replacement in Perl
$Line =~ s/\]/|/ for 2..4; #Replace 2nd 'til 4th delimiter (]) with pipe (|)
....
> Yep,
>
> here is a sample of replacement in Perl
>
> $Line =~ s/\]/|/ for 2..4;#Replace 2nd 'til
> 4th delimiter (]) with pipe (|)
>
> ....
This must be an addition to Perl regex since I last looked at it! What version of Perl?
> Yep,
>
> here is a sample of replacement in Perl
>
> $Line =~ s/\]/|/ for 2..4;#Replace 2nd 'til
> 4th delimiter (]) with pipe (|)
>
> ....
Based on the reference I gave earlier
import java.util.regex.*;
/**
 * A rewriter does a global substitution in the strings passed to its
 * 'rewrite' method. It uses the pattern supplied to its constructor,
 * and is like 'String.replaceAll' except for the fact that its
 * replacement strings are generated by invoking a method you write,
 * rather than from another string.
 *
 * This class is supposed to be equivalent to Ruby's 'gsub' when given
 * a block. This is the nicest syntax I've managed to come up with in
 * Java so far. It's not too bad, and might actually be preferable if
 * you want to do the same rewriting to a number of strings in the same
 * method or class.
 *
 * See the example 'main' for a sample of how to use this class.
 *
 * @author Elliott Hughes
 */
public abstract class Rewriter_1
{
    private Pattern pattern;
    private Matcher matcher;
    /**
     * Constructs a rewriter using the given regular expression;
     * the syntax is the same as for 'Pattern.compile'.
     */
    public Rewriter_1(String regularExpression)
    {
        this.pattern = Pattern.compile(regularExpression);
    }
    /**
     * Returns the input subsequence captured by the given group
     * during the previous match operation.
     */
    public String group(int i)
    {
        return matcher.group(i);
    }
    /**
     * Overridden to compute a replacement for each match. Use
     * the method 'group' to access the captured groups.
     */
    public abstract String replacement(int index);
    /**
     * Returns the result of rewriting 'original' by invoking
     * the method 'replacement' for each match of the regular
     * expression supplied to the constructor.
     */
    public String rewrite(CharSequence original)
    {
        this.matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        int index = 0;
        while (matcher.find())
        {
            matcher.appendReplacement(result, replacement(++index));
        }
        matcher.appendTail(result);
        return result.toString();
    }
    public static void main(String[] arguments)
    {
        String result = new Rewriter_1("\\|")
        {
            public String replacement(int index)
            {
                if ((index >= 3) && (index <= 5))
                {
                    return "y";
                }
                else
                {
                    return group(0);
                }
            }
        }.rewrite("| | | | | |");
        System.out.println(result);
    }
}
>> $Line =~ s/\]/|/ for 2..4; #Replace 2nd 'til 4th delimiter (]) with pipe (|)
> This must be an addition to Perl regex since I
> last looked at it! What version of Perl?
I was expecting something using the /g flag, but this looks like it does a single replace each time through the for loop. I don't see how it's supposed to know to start at match #2, though. On my machine (Perl 5.8.6), it just replaces the first three matches.
perl -e "$txt = 'aaaaa'; $txt =~ s/a/z/ for (2..4); print $txt;"
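
For comparison, the same behaviour can be reproduced in Java. This is only an illustrative sketch (the class and method names are invented here): calling String.replaceFirst in a loop never skips ahead to "match #2"; each pass simply replaces the first match that is still left, so three passes consume the first three matches.

```java
// Hypothetical demo class; shows why a loop of single replacements
// ends up replacing the first N matches rather than matches 2..4.
class RepeatedReplaceFirst {

    static String replaceFirstNTimes(String input, String regex,
                                     String replacement, int times) {
        String result = input;
        for (int i = 0; i < times; i++) {
            // Each call starts again from the beginning of the string.
            result = result.replaceFirst(regex, replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three passes, like "for (2..4)" in the Perl one-liner above.
        System.out.println(replaceFirstNTimes("aaaaa", "a", "z", 3)); // prints zzzaa
    }
}
```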
> >> $Line =~ s/\]/|/ for 2..4; #Replace 2nd 'til
> 4th delimiter (]) with pipe (|)
>
> > This must be an addition to Perl regex since I
> > last looked at it! What version of Perl?
>
> I was expecting something using the /g flag, but this
> looks like it does a single replace each time through
> the for loop. I don't see how it's supposed
> to know to start at match #2, though. On my machine
> (Perl 5.8.6), it just replaces the first three
> matches.
>
> perl -e "$txt = 'aaaaa'; $txt =~ s/a/z/ for
> (2..4); print $txt;"
I get the same, but since I also have v5.8.6 this is not a surprise!
Also, the Java code I posted as an adaptation of Elliott Hughes's class is overcomplicated. One can use Elliott's class directly, e.g.
public abstract class TestRewriter
{
    public static void main(String[] arguments)
    {
        String result = new Rewriter("\\|")
        {
            int index = 0;
            public String replacement()
            {
                return ((++index >= 3) && (index <= 5)) ? "x" : group(0);
            }
        }.rewrite("| | | | | |");
        System.out.println(result);
    }
}
> Also, the Java code I posted as an adaption of
> Elliott Hughes's class is overcomplicated. One
> can uses Elliot's class directly e.g
There's still one problem: a Rewriter is supposed to be reusable, but you never reset the index. Here's what I came up with:
public static void main(String[] args)
{
    String result =
        new Rewriter("a")
        {
            int counter;
            public String rewrite(CharSequence orig)
            {
                // reset the match counter so the rewriter can be reused
                counter = 0;
                return super.rewrite(orig);
            }
            public String replacement()
            {
                String repl = counter >= 2 && counter <= 4 ? "z" : group(0);
                counter++;
                return repl;
            }
        }.rewrite("aaaaaa");
    System.out.println(result);
}
If I were going to do a lot of this kind of thing, I would subclass Rewriter so I could pass the start and end indices in via the rewrite() method.
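
A rough sketch of what such a subclass might look like follows. Everything in it is an assumption for illustration: the Rewriter base class is a minimal stand-in with the same shape as the one used in this thread (not Elliott Hughes's actual source), RangeRewriter is a made-up name, and the start and end indices are taken to be 1-based and inclusive.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal stand-in for the Rewriter class used in this thread (assumed shape).
abstract class Rewriter {
    private final Pattern pattern;
    private Matcher matcher;

    public Rewriter(String regularExpression) {
        this.pattern = Pattern.compile(regularExpression);
    }

    public String group(int i) {
        return matcher.group(i);
    }

    public abstract String replacement();

    public String rewrite(CharSequence original) {
        matcher = pattern.matcher(original);
        StringBuffer result = new StringBuffer(original.length());
        while (matcher.find()) {
            // Quote the replacement so '$' and '\' are taken literally.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement()));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}

// Hypothetical subclass: replaces only matches start..end (1-based, inclusive).
class RangeRewriter extends Rewriter {
    private final String newText;
    private int counter;
    private int start;
    private int end;

    public RangeRewriter(String regularExpression, String newText) {
        super(regularExpression);
        this.newText = newText;
    }

    public String rewrite(CharSequence original, int start, int end) {
        this.counter = 0;   // reset so the instance can be reused
        this.start = start;
        this.end = end;
        return super.rewrite(original);
    }

    @Override
    public String rewrite(CharSequence original) {
        // With no range given, replace every match.
        return rewrite(original, 1, Integer.MAX_VALUE);
    }

    @Override
    public String replacement() {
        counter++;
        return (counter >= start && counter <= end) ? newText : group(0);
    }

    public static void main(String[] args) {
        RangeRewriter r = new RangeRewriter("a", "z");
        System.out.println(r.rewrite("aaaaaa", 3, 5));
        System.out.println(r.rewrite("aaaaaa", 1, 2)); // same instance, new range
    }
}
```

Passing the range to rewrite() rather than the constructor keeps one compiled pattern reusable for different ranges, at the cost of the instance not being safe to share between threads.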
> There's still one problem: a Rewriter is supposed to
> be reusable, but you never reset the index.
> ...
> If I were going to do a lot of this kind of
> thing, I would subclass Rewriter so I could pass the
> start and end indices in via the rewrite() method.
Yes, yours is a more general approach than mine. I was just trying to show the basics. I had come up with the same solution for resetting the counter but I had not taken the extra step of defining a range to replace.
