javascript - Looking for a better regex solution


My input is:
<span question_number="18"> blah blah blah 1</span><span question_number="19"> blah blah blah 2</span>

I want a regex that matches the <span question_number="somenumber">xxxx</span> pattern, with the desired output being 1. somenumber, 2. xxxx.

I wrote a naive solution that covers this input:
<span question_number="18"> blah blah blah 1</span>
<span question_number="19"> blah blah blah 2</span>
Notice that the spans are on different lines.
Output: 18, blah blah blah 1, 19, blah blah blah 2

But when the input is <span question_number="18"> blah blah blah 1</span><span question_number="19"> blah blah blah 2</span>
on the same line,

my output is: 18, blah blah blah 1</span><span question_number="19"> blah blah blah 2

How can I get around this problem?

Update: my regex is: /\<span question_number=(?:\")*(\d*)(?:\")*>(.*)<\/span>/ig

Test input:
Case 1 -> 2 lines of code
<span question_number="54">often graces doorways tied ropes called</span>
<span question_number="54">often graces doorways tied ropes called <i>ristras</i>.</span>
Case 2 -> 1 line of code
<span question_number="54">often graces doorways tied ropes called</span><span question_number="54">often graces doorways tied ropes called <i>ristras</i>.</span>

Update 2:
This is not the DOM; it is plain text that I want to process.

Update 3: the regex problem is solved. I now have a question about comparing the processing speed of the regex against a DOM operation. How would I implement such a test?
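One way such a comparison might be set up is sketched below. It is only a rough timing harness, not a rigorous benchmark: timeIt, viaRegex, viaDom and the generated sample string are hypothetical names and data made up for illustration, and the DOM branch only runs where DOMParser is available (for example in a browser). Results will vary with the engine, input size and warm-up.

```javascript
// Timer: performance.now() where available, Date.now() otherwise.
const now =
  typeof performance !== 'undefined' && typeof performance.now === 'function'
    ? () => performance.now()
    : () => Date.now();

// Hypothetical helper: run fn `iterations` times, return elapsed milliseconds.
function timeIt(fn, iterations) {
  const start = now();
  for (let i = 0; i < iterations; i++) fn();
  return now() - start;
}

// Made-up sample input: 1000 spans on a single line.
const sample = Array.from(
  { length: 1000 },
  (_, i) => `<span question_number="${i}">text ${i}</span>`
).join('');

// Regex approach: collect [number, text] pairs with a non-greedy pattern.
function viaRegex(text) {
  const re = /<span question_number="(\d+)">(.*?)<\/span>/g;
  return Array.from(text.matchAll(re), (m) => [m[1], m[2]]);
}

// DOM approach: parse the text as HTML and read the spans back out.
function viaDom(text) {
  const doc = new DOMParser().parseFromString(text, 'text/html');
  return Array.from(doc.querySelectorAll('span[question_number]'), (el) => [
    el.getAttribute('question_number'),
    el.innerHTML,
  ]);
}

const regexMs = timeIt(() => viaRegex(sample), 100);
console.log('regex:', regexMs.toFixed(1), 'ms');

// Only time the DOM version where a DOM parser exists.
if (typeof DOMParser !== 'undefined') {
  const domMs = timeIt(() => viaDom(sample), 100);
  console.log('DOM:', domMs.toFixed(1), 'ms');
}
```

Running each approach many times on the same input and comparing the totals gives a first impression; it is worth repeating the run several times, since the first pass often includes warm-up cost.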

If it isn't HTML (hmm?), try with

<span question_number="(\d+)">(.*?)<\/span> 

See it here at regex101.

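The problem with your original regex is that it's greedy. The part (.*) will match as many characters as it can, while making sure the remaining <\/span> can still be matched. So it finds the first <span..., and matches up to the last </span>. That is also why the input on separate lines worked: . does not match a line break, so the greedy match could not run past the end of each line. My attempt at a solution is non-greedy (the ? in (.*?)), matching only up to the first </span>.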
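The difference between the greedy and the non-greedy quantifier can be seen in a small sketch. The input string is the single-line example from the question; the helper extract and the variable names are made up for illustration.

```javascript
// Sample input from the question: both spans on one line.
const input =
  '<span question_number="18"> blah blah blah 1</span>' +
  '<span question_number="19"> blah blah blah 2</span>';

// Greedy: (.*) runs on to the last </span> on the line.
const greedy = /<span question_number="(\d+)">(.*)<\/span>/g;
// Non-greedy: (.*?) stops at the first </span> it can.
const lazy = /<span question_number="(\d+)">(.*?)<\/span>/g;

// Hypothetical helper: collect [number, text] pairs for every match.
function extract(regex, text) {
  return Array.from(text.matchAll(regex), (m) => [m[1], m[2]]);
}

const greedyResult = extract(greedy, input);
const lazyResult = extract(lazy, input);

console.log(greedyResult.length); // 1 (one match swallowing both spans)
console.log(lazyResult.length);   // 2 (one match per span)
console.log(lazyResult);
```

Note that matchAll requires the g flag on the regex. If the text inside a span can itself contain a line break, .*? will not cross it; that case would need something like [\s\S]*? instead, which is an assumption beyond the inputs shown in the question.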

