Is there an easy way to get the number of repeating characters in a word?

emremrah 11/08/2018. 6 answers, 97 views
python regex string counter

I'm trying to count how many times any character repeats in a word. The repetitions must be sequential.

For example, the method with input "loooooveee" should return 6 (4 times 'o', 2 times 'e').

I'm trying to implement string-level functions, and I can do it this way, but is there an easier way to do this? A regex, or something similar?

So far I tried this:

def measure_normalized_emphasis(text):
    char = text[-1]
    emphasis_size = 0
    for i in range(1, len(text)):
        if text[-i] == char:
            emphasis_size += 1
        else:
            char = text[i - 1]

    return emphasis_size

But it returns 8 for "loooooveee".

6 Answers


jpp 11/08/2018.

Original question: order of repetition does not matter

You can subtract the number of unique letters from the total number of letters. set applied to a string returns the collection of unique letters in it.

x = "loooooveee"
res = len(x) - len(set(x))  # 6

Or you can use collections.Counter, subtract 1 from each value, then sum:

from collections import Counter

c = Counter("loooooveee")

res = sum(i-1 for i in c.values())  # 6

New question: repetitions must be sequential

You can use itertools.groupby to group sequential identical characters:

from itertools import groupby

g = groupby("aooooaooaoo")

res = sum(sum(1 for _ in j) - 1 for i, j in g)  # 5
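
As a rough sketch of the same groupby idea, the runs can also be reported per character, which matches the "4 times 'o', 2 times 'e'" breakdown in the question. The function name repeat_runs is only illustrative and does not come from the original answer.

```python
from itertools import groupby


def repeat_runs(text):
    """Return (character, extra_repeats) for every run longer than one character."""
    runs = []
    # groupby yields one (key, group) pair per run of identical adjacent characters
    for char, group in groupby(text):
        length = sum(1 for _ in group)
        if length > 1:
            # a run of length n contains n - 1 repetitions
            runs.append((char, length - 1))
    return runs


runs = repeat_runs("loooooveee")
print(runs)                     # [('o', 4), ('e', 2)]
print(sum(n for _, n in runs))  # 6
```

Summing the second element of each pair gives the same total as the one-liner above.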

Jan 11/08/2018.

You could use a regular expression if you want:

import re

rx = re.compile(r'(\w)\1+')

repeating = sum([m.span()[1] - m.span()[0] - 1 for m in rx.finditer("loooooveee")])
print(repeating)

This correctly yields 6 and makes use of the .span() function.


The expression is

(\w)\1+

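which captures a word character (a-zA-Z0-9_ in ASCII; in Python 3, \w also matches Unicode letters and digits by default) and then matches one or more further repetitions of it, as many as possible.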
See a demo on regex101.com for the repeating pattern.


If you want to match any character (that is, not only word characters), change your expression to:

(.)\1+

See another demo on regex101.com.
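
To illustrate the difference between the two expressions, here is a minimal sketch that takes the pattern as a parameter. The helper name count_repeats and the sample string "wow!!!" are assumptions made for this example only.

```python
import re


def count_repeats(text, pattern=r'(.)\1+'):
    # Each match is one run of a repeated character;
    # a run of length n contributes n - 1 repetitions.
    return sum(m.end() - m.start() - 1 for m in re.finditer(pattern, text))


print(count_repeats("loooooveee"))          # 6
print(count_repeats("wow!!!"))              # 2: "!!!" is matched by (.)
print(count_repeats("wow!!!", r'(\w)\1+'))  # 0: "!" is not a word character
```

Note that . does not match a newline unless the pattern is compiled with re.DOTALL, so repeated line breaks are not counted by this sketch.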


vencaslac 11/08/2018.

Try this:

word = input('something:')

total = 0  # named total to avoid shadowing the built-in sum()

chars = set(word)  # get the set of unique characters

for item in chars:  # iterate over the set and output the count for each item
    if word.count(item) > 1:
        # counts every occurrence of the character, adjacent or not
        total += word.count(item)
    print('{}|{}'.format(item, word.count(item)))

print('Total:' + str(total))

EDIT:

added total count of repetitions


Alexis 11/08/2018.

I'm not giving you a better solution; many have already done that. I'll just correct the one you gave.

def mne(text):
    char = text[0]
    emphasis_size = 0
    for i in range(1, len(text)):
        # debug output: index, current character, previous character, running count
        print(i, text[i], char, emphasis_size)
        if text[i] == char:
            emphasis_size += 1
        else:
            char = text[i]

    return emphasis_size

With "loooooveee" this gives me:

>>>1 o l 0
   2 o o 0
   3 o o 1
   4 o o 2
   5 o o 3
   6 v o 4
   7 e v 4
   8 e e 4
   9 e e 5
   6

Which is what you wanted. No need to go backward and no need for [i-1]: just go forward and compare two adjacent positions (i and i-1), with char holding the previous character.


Dhruv Joshi 11/08/2018.

Since it doesn't matter where the repetition occurs or which characters are repeated, you can use the set data structure provided by Python. A set discards duplicate occurrences of any character or other object.

Therefore, the solution would look something like this:

def measure_normalized_emphasis(text):
    return len(text) - len(set(text))

This will give you the exact result.

Also, make sure to check some edge cases, which is good practice anyway.
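
One edge case worth checking is a character that recurs without being adjacent. The sketch below compares the set-based count with a run-based count on a few sample words; both function names and the word list are made up for illustration.

```python
from itertools import groupby


def repeats_any_position(text):
    # Set-based count: every occurrence beyond the first, adjacent or not.
    return len(text) - len(set(text))


def repeats_adjacent_only(text):
    # Run-based count: only characters equal to their immediate predecessor.
    return sum(len(list(run)) - 1 for _, run in groupby(text))


for word in ["loooooveee", "aooooaooaoo", "banana", ""]:
    print(repr(word), repeats_any_position(word), repeats_adjacent_only(word))
# 'loooooveee' 6 6
# 'aooooaooaoo' 9 5
# 'banana' 3 0
# '' 0 0
```

The two counts agree on "loooooveee" only because no letter in that word reappears after its run ends.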


doctorlove 11/08/2018.

I think your code is comparing the wrong things

You start by finding the last character:

char = text[-1]

Then you compare this to itself:

for i in range(1, len(text)):
    if text[-i] == char: #<-- surely this is text[-1] to begin with?

Why not just run through the characters:

def measure_normalized_emphasis(text):
    char = text[0]
    emphasis_size = 0
    for i in range(1, len(text)):
        if text[i] == char:
            emphasis_size += 1
        else:
            char = text[i]

    return emphasis_size

This seems to work.
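
One caveat: indexing text[0] raises an IndexError on an empty string. A possible variant, shown here only as a sketch under the hypothetical name count_adjacent_repeats, pairs each character with its successor instead and therefore needs no special case.

```python
def count_adjacent_repeats(text):
    # Pair every character with the one after it. zip stops at the shorter
    # input, so empty and one-character strings produce no pairs at all.
    # True counts as 1 when summed.
    return sum(a == b for a, b in zip(text, text[1:]))


print(count_adjacent_repeats("loooooveee"))  # 6
print(count_adjacent_repeats(""))            # 0
print(count_adjacent_repeats("a"))           # 0
```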

