How to preserve original indexing of string elements after using regex

I’m trying to remove all letters and numbers from a string. For this I use the following regex methods:

import re

txt = read_file()  # read_file() is my own helper (not shown here)

pattern = r'[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwzyx0123456789]'
mod_string = re.sub(pattern, '', txt)
new_string = re.sub(r'\s+', '', mod_string)
for i in new_string:
    print(new_string.index(i), i)

The first re.sub() call removes the letters and digits; the second removes the whitespace that remains.
However, when I print the index of each character in the resulting string, I get the following (this is only a snippet, the real output is much larger):

0 .
91 ]
92 <
92 <
57 -
1 ,
0 .
1 ,
0 .
99 (
100 )
1 ,
99 (
100 )
0 .
0 .
106 >
106 >
0 .
0 .
0 .
0 .
99 (
100 )

How can I achieve the same, but with ‘normal’ indexing?
For example, if I have the string “>)(.>”, I want the indexes to be 0: >, 1: ), 2: (, 3: ., 4: > so that I can work with it more easily afterwards. I want to convert the string to a list of single-character strings, which is why I need the indexes to look like this.

Solution:

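What about this? new_string.index(i) returns the position of the first occurrence of i, so a character that repeats always prints the same index. enumerate() gives you the running position instead.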

import re

txt = "ab>c)def(0123456    .> 789"

pat = re.compile(r'[a-zA-Z0-9 ]')
new_string = pat.sub('', txt) # new_string is ">)(.>" after pat.sub operation.

for i, s in enumerate(new_string): # Just show a pair of index and character in new_string
    print(i, s)
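
If the end goal is a list of single-character strings, a rough sketch along the following lines may help. It is only an illustration under a few assumptions: the sample txt stands in for whatever read_file() returns, \s is used in the character class so that tabs and newlines are stripped too (the pattern above only removes plain spaces), and the names chars, first_hits, pairs and original_positions are made up for this example. The last part uses re.finditer in case the positions in the original text are what you actually need.

```python
import re

# Hypothetical sample input; the question reads its text from read_file().
txt = "ab>c)def(0123456    .> 789"

# One pass: drop ASCII letters, digits and any whitespace (\s covers tabs/newlines too).
pat = re.compile(r'[A-Za-z0-9\s]')
new_string = pat.sub('', txt)

# list() splits a string into one-character strings, indexed 0..len-1.
chars = list(new_string)

# str.index returns the FIRST occurrence, so the repeated '>' reports 0 twice.
first_hits = [new_string.index(c) for c in new_string]

# enumerate yields the running position of each character.
pairs = list(enumerate(chars))

# Assumption: if the positions in the ORIGINAL text matter, match the kept
# characters instead of deleting the others, and read m.start() from each match.
original_positions = [(m.start(), m.group())
                      for m in re.finditer(r'[^A-Za-z0-9\s]', txt)]

print(chars)               # ['>', ')', '(', '.', '>']
print(first_hits)          # [0, 1, 2, 3, 0]
print(pairs)               # [(0, '>'), (1, ')'), (2, '('), (3, '.'), (4, '>')]
print(original_positions)  # [(2, '>'), (4, ')'), (8, '('), (20, '.'), (21, '>')]
```

After this, chars[i] is simply the i-th remaining character, so no index lookup is needed at all.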