In light of the high frequency of disyllabic words in modern Chinese and the “default” phonological status of the disyllabic prosodic foot in speech production, we conducted a series of psycholinguistic experiments to determine whether the quantitative property of prosodic feet (i.e., the number of syllables per foot) significantly influences how listeners parse syntactically ambiguous utterances in speech perception. More specifically, in the absence of contextual or acoustic cues, are native Mandarin speakers biased towards disyllabic structures when listening to six-syllable utterances that can be parsed into syntactically similar structures composed entirely of either disyllabic or trisyllabic feet? Results indicate that Mandarin speakers tend to parse ambiguous utterances initially into one of the two possible syntactic structures rather than simply recognizing them as ambiguous. Nonetheless, they do not favour disyllabic structures when lexical information such as word meaning, syntactic function, and usage frequency is available. By contrast, when parsing sequences of six random digits, where lexical and syntactic information is irrelevant, our results point to a clear preference for disyllabic grouping. In other words, the quantitative property of prosodic feet plays a significant role in Mandarin speech parsing only when lexical and syntactic information is irrelevant or unavailable.