# Is there an easy way to get the number of repeating characters in a word?

emremrah 11/08/2018. 6 answers, 97 views

I'm trying to count how many times any character repeats in a word. The repetitions must be sequential.

For example, the method with input "loooooveee" should return 6 (4 times 'o', 2 times 'e').

I'm implementing string-level functions and can do it this way, but is there an easier way? A regex, or something similar?

So far I tried this:

def measure_normalized_emphasis(text):
    char = text[-1]
    emphasis_size = 0
    for i in range(1, len(text)):
        if text[-i] == char:
            emphasis_size += 1
        else:
            char = text[i - 1]

    return emphasis_size

And it returns 8 with "loooooveee".

jpp 11/08/2018.

### Original question: order of repetition does not matter

You can subtract the number of unique letters from the total number of letters. Applying set to a string returns the collection of its unique letters.

x = "loooooveee"
res = len(x) - len(set(x))  # 6

Or you can use collections.Counter, subtract 1 from each value, then sum:

from collections import Counter

c = Counter("loooooveee")

res = sum(i-1 for i in c.values())  # 6

### New question: repetitions must be sequential

You can use itertools.groupby to group sequential identical characters:

from itertools import groupby

g = groupby("aooooaooaoo")

res = sum(sum(1 for _ in j) - 1 for i, j in g)  # 5
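As a minimal sketch, the groupby idea can be wrapped in a reusable function. The name count_sequential_repeats is hypothetical and not part of the original answer; each run of n identical characters contributes n - 1 repeats.

```python
from itertools import groupby


def count_sequential_repeats(text):
    """Count characters that repeat the character immediately before them."""
    # groupby yields one (key, run) pair per block of consecutive equal characters;
    # a run of n characters contains n - 1 repeats.
    return sum(len(list(run)) - 1 for _, run in groupby(text))


print(count_sequential_repeats("loooooveee"))   # 6
print(count_sequential_repeats("aooooaooaoo"))  # 5
print(count_sequential_repeats(""))             # 0
```

An empty string produces no groups, so the sum is simply 0.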

Jan 11/08/2018.

You could use a regular expression if you want:

import re

rx = re.compile(r'(\w)\1+')

repeating = sum([m.span()[1] - m.span()[0] - 1 for m in rx.finditer("loooooveee")])
print(repeating)

This correctly yields 6 and makes use of the match object's .span() method.

The expression is

(\w)\1+

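which captures a word character (for ASCII text, one of a-zA-Z0-9_; in Python 3, \w also matches Unicode letters and digits in str patterns) and then, via the backreference \1, matches one or more further copies of that same character, as many as possible.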
See a demo on regex101.com for the repeating pattern.

If you want to match any character (that is, not only word characters), change your expression to:

(.)\1+
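A possible way to package the any-character variant is sketched below. The function name count_repeats_regex is hypothetical, and passing re.DOTALL is an assumption about what "any character" should include: without that flag, "." does not match a newline.

```python
import re

# (.) captures any single character; \1+ requires one or more copies right after it.
# re.DOTALL (an assumption, see above) lets "." match newlines too.
RUN_PATTERN = re.compile(r"(.)\1+", re.DOTALL)


def count_repeats_regex(text):
    """Sum, over every run of identical characters, the run length minus one."""
    # m.group(0) is the whole run; the first character of a run is not a repeat.
    return sum(len(m.group(0)) - 1 for m in RUN_PATTERN.finditer(text))


print(count_repeats_regex("loooooveee"))  # 6
print(count_repeats_regex("a!!!b"))       # 2
```

Using len(m.group(0)) is equivalent to subtracting the two ends of m.span(), just a little shorter to read.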

vencaslac 11/08/2018.

Try this:

word = input('something:')

total = 0  # named total to avoid shadowing the built-in sum()

chars = set(word)  # get the set of unique characters

for item in chars:  # iterate over the set and output the count for each item
    if word.count(item) > 1:
        total += word.count(item)
    print('{}|{}'.format(item, word.count(item)))

print('Total:' + str(total))

EDIT:

added total count of repetitions

Alexis 11/08/2018.

I'm not giving you a better solution; many others have done that. I'll just correct the one you gave.

def mne(text):
    char = text[0]
    emphasis_size = 0
    for i in range(1, len(text)):
        print(i, text[i], char, emphasis_size)
        if text[i] == char:
            emphasis_size += 1
        else:
            char = text[i]

    return emphasis_size

is giving me:

>>> mne("loooooveee")
1 o l 0
2 o o 0
3 o o 1
4 o o 2
5 o o 3
6 v o 4
7 e v 4
8 e e 4
9 e e 5
6

Which is what you wanted. There is no need to go backward and no need for text[i - 1]: just go forward and compare each character with the one before it (char always holds the same value as text[i - 1]).

Dhruv Joshi 11/08/2018.

Since it doesn't matter where the repetition occurs or which characters are repeated, you can make use of Python's built-in set data structure. A set discards duplicate occurrences of any character (or any other hashable object).

Therefore, the solution would look something like this:

def measure_normalized_emphasis(text):
    return len(text) - len(set(text))

For "loooooveee" this returns 6: 10 characters minus 4 unique ones.

Also, make sure to test some edge cases, which is good practice anyway.
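A few such edge cases can be checked quickly. This is only a sketch with a hypothetical function name, count_duplicates; note in particular that the set-based count is order-insensitive, so it also counts duplicates that are not next to each other.

```python
def count_duplicates(text):
    """Count every occurrence of a character beyond its first, adjacent or not."""
    return len(text) - len(set(text))


print(count_duplicates(""))      # 0: empty input is handled without errors
print(count_duplicates("abc"))   # 0: no character occurs twice
print(count_duplicates("abab"))  # 2: counted although no equal characters are adjacent
print(count_duplicates("aA"))    # 0: the comparison is case-sensitive
```

If only sequential repetitions should count, the "abab" case shows why this approach would overcount.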

doctorlove 11/08/2018.

I think your code is comparing the wrong things.

You start by finding the last character:

char = text[-1]

Then you compare this to itself:

for i in range(1, len(text)):
    if text[-i] == char:  # <-- surely this is text[-1] to begin with?

Why not just run through the characters:

def measure_normalized_emphasis(text):
    char = text[0]
    emphasis_size = 0
    for i in range(1, len(text)):
        if text[i] == char:
            emphasis_size += 1
        else:
            char = text[i]

    return emphasis_size

This seems to work.
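One caveat: because of text[0], the loop above raises an IndexError for an empty string. A possible alternative, sketched here under a hypothetical name (count_adjacent_repeats), compares each character with its successor using zip and so handles empty input as well.

```python
def count_adjacent_repeats(text):
    """Count positions where a character equals the one right before it."""
    # zip(text, text[1:]) pairs each character with its successor;
    # each True comparison counts as 1 when summed.
    return sum(a == b for a, b in zip(text, text[1:]))


print(count_adjacent_repeats("loooooveee"))  # 6
print(count_adjacent_repeats(""))            # 0
```

This is the same forward comparison as in the loop, just without index bookkeeping.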