METHOD AND APPARATUS FOR VIDEO OBJECT SEGMENTATION

Bibliographic Details
Title:	METHOD AND APPARATUS FOR VIDEO OBJECT SEGMENTATION
Document Number:	20120294530
Publication Date:	November 22, 2012
Appl. No:	13/522589
Application Filed:	January 20, 2011
Abstract:	Methods and apparatus for video object segmentation are provided, suitable for use in a super-resolution system. The method comprises alignment of frames of a video sequence, pixel alignment to generate initial foreground masks using a similarity metric, consensus filtering to generate an intermediate foreground mask, and refinement of the mask using spatio-temporal information from the video sequence. In various embodiments, the similarity metric is computed using a sum of squared differences approach, a correlation, or a modified normalized correlation metric. Soft thresholding of the similarity metric is also used in one embodiment of the present principles. Weighting factors are also applied to certain critical frames in the consensus filtering stage in one embodiment using the present principles.
Inventors:	Bhaskaranand, Malavika (Goleta, CA, US); Bhagavathy, Sitaram (Plainsboro, NJ, US)
Claim:	1. A method for video object segmentation, comprising: aligning one or more reference frames with a current frame containing a video object; generating a foreground mask for a current frame based on a neighborhood similarity metric; and refining the foreground mask by using information from at least one video frame or mask.
Claim:	2. The method of claim 1, further comprising: generating initial foreground masks for a current frame with respect to each aligned reference frame based on a neighborhood similarity metric; combining information from the initial foreground masks to generate a single intermediate foreground mask for the current frame before refining the intermediate foreground mask.
Claim:	3. The method of claim 1, wherein the information from at least one video frame or mask used in the refining step is some combination of spatial and temporal information.
Claim:	4. The method of claim 2 wherein the combining step is performed using a consensus filtering mechanism.
Claim:	5. The method of claim 1, wherein said aligning step uses multi-hop homography between frames.
Claim:	6. The method of claim 2, wherein the initial foreground masks are generated on a block basis.
Claim:	7. The method of claim 2, wherein said initial foreground masks are generated using a normalized correlation metric.
Claim:	8. The method of claim 2, wherein the initial foreground masks are generated using weighting factors that weigh individual frames.
Claim:	9. The method of claim 2, wherein a three-level intermediate mask is used when generating foreground masks.
Claim:	10. The method of claim 2, wherein morphological operations are used to combine information from the initial foreground masks to generate a single mask for the current frame.
Claim:	11. An apparatus for video object segmentation, comprising: a memory and frame alignment mechanism that stores a plurality of frames of video and aligns one or more reference frames with a current frame containing a video object; circuitry that generates an intermediate mask for the current frame based on a neighborhood similarity metric; and a processor that refines the intermediate mask by using information from at least one video frame or mask.
Claim:	12. The apparatus of claim 11, further comprising: circuitry that generates initial foreground masks for a current frame with respect to each aligned reference frame based on a neighborhood similarity metric; a generator that combines information from the initial foreground masks to generate an intermediate mask for the current frame before refining the intermediate foreground mask.
Claim:	13. The apparatus of claim 11, wherein the processor uses information from at least one video frame or mask that is some combination of spatial and temporal information.
Claim:	14. The apparatus of claim 12, wherein the generator combines information using a consensus filtering mechanism.
Claim:	15. The apparatus of claim 11, wherein said memory and frame alignment mechanism uses multi-hop homography between frames.
Claim:	16. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates masks on a block basis.
Claim:	17. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using a normalized correlation metric.
Claim:	18. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using weighting factors that weight individual frames.
Claim:	19. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using a three-level intermediate mask.
Claim:	20. The apparatus of claim 11, wherein said processor uses morphological operations to combine information from the foreground masks to generate a single mask for the current frame.
Current U.S. Class:	382/173
Current International Class:	06
Accession Number:	edspap.20120294530
Database:	USPTO Patent Applications
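
Illustrative sketch (not part of the patent record): the abstract and claims 1, 2 and 4 describe generating an initial foreground mask per aligned reference frame from a neighborhood similarity metric, then combining those masks by consensus filtering. The Python below is one minimal reading of those two steps, assuming the frames are already aligned and are given as grayscale 2-D lists. The function names, the neighborhood radius, the threshold value, the division of the sum of squared differences by the neighborhood size, and the vote-count rule are all assumptions made for illustration. The record does not specify any of them, and frame alignment and mask refinement are not shown.

```python
from typing import List

Frame = List[List[float]]   # grayscale frame, rows of pixel intensities
Mask = List[List[int]]      # 1 = foreground, 0 = background


def ssd_mask(current: Frame, reference: Frame,
             radius: int = 1, threshold: float = 100.0) -> Mask:
    """Initial foreground mask from a neighborhood sum of squared differences.

    Assumption: the SSD is divided by the number of pixels in the
    neighborhood so that border pixels (smaller neighborhoods) are
    compared against the same threshold. The threshold is hypothetical.
    """
    h, w = len(current), len(current[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            # Accumulate squared differences over the clipped neighborhood.
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = current[ny][nx] - reference[ny][nx]
                    total += d * d
                    count += 1
            # Low similarity to the aligned reference suggests foreground.
            if total / count > threshold:
                mask[y][x] = 1
    return mask


def consensus_mask(masks: List[Mask], min_votes: int) -> Mask:
    """Combine initial masks: a pixel is foreground when at least
    min_votes of the initial masks mark it (assumed consensus rule)."""
    h, w = len(masks[0]), len(masks[0][0])
    return [[1 if sum(m[y][x] for m in masks) >= min_votes else 0
             for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    background = [[10.0] * 4 for _ in range(4)]
    current = [row[:] for row in background]
    current[1][1] = 200.0  # a single bright "object" pixel

    initial = [ssd_mask(current, background) for _ in range(3)]
    for row in consensus_mask(initial, min_votes=2):
        print(row)
```

A hard threshold is used here for brevity. The abstract also mentions soft thresholding and per-frame weighting factors in the consensus stage, which this sketch does not attempt.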
