METHOD AND APPARATUS FOR VIDEO OBJECT SEGMENTATION

Bibliographic Details
Title: METHOD AND APPARATUS FOR VIDEO OBJECT SEGMENTATION
Document Number: 20120294530
Publication Date: November 22, 2012
Appl. No: 13/522589
Application Filed: January 20, 2011
Abstract: Methods and apparatus for video object segmentation are provided, suitable for use in a super-resolution system. The method comprises alignment of frames of a video sequence, pixel alignment to generate initial foreground masks using a similarity metric, consensus filtering to generate an intermediate foreground mask, and refinement of the mask using spatio-temporal information from the video sequence. In various embodiments, the similarity metric is computed using a sum of squared differences approach, a correlation, or a modified normalized correlation metric. Soft thresholding of the similarity metric is also used in one embodiment of the present principles. Weighting factors are also applied to certain critical frames in the consensus filtering stage in one embodiment using the present principles.
Inventors: Bhaskaranand, Malavika (Goleta, CA, US); Bhagavathy, Sitaram (Plainsboro, NJ, US)
Claim: 1. A method for video object segmentation, comprising: aligning one or more reference frames with a current frame containing a video object; generating a foreground mask for a current frame based on a neighborhood similarity metric; and refining the foreground mask by using information from at least one video frame or mask.
Claim: 2. The method of claim 1, further comprising: generating initial foreground masks for a current frame with respect to each aligned reference frame based on a neighborhood similarity metric; combining information from the initial foreground masks to generate a single intermediate foreground mask for the current frame before refining the intermediate foreground mask.
Claim: 3. The method of claim 1, wherein the information from at least one video frame or mask used in the refining step is some combination of spatial and temporal information.
Claim: 4. The method of claim 2 wherein the combining step is performed using a consensus filtering mechanism.
Claim: 5. The method of claim 1, wherein said aligning step uses multi-hop homography between frames.
Claim: 6. The method of claim 2, wherein the initial foreground masks are generated on a block basis.
Claim: 7. The method of claim 2, wherein said initial foreground masks are generated using a normalized correlation metric.
Claim: 8. The method of claim 2, wherein the initial foreground masks are generated using weighting factors that weigh individual frames.
Claim: 9. The method of claim 2, wherein a three-level intermediate mask is used when generating foreground masks.
Claim: 10. The method of claim 2, wherein morphological operations are used to combine information from the initial foreground masks to generate a single mask for the current frame.
Claim: 11. An apparatus for video object segmentation, comprising: a memory and frame alignment mechanism that stores a plurality of frames of video and aligns one or more reference frames with a current frame containing a video object; circuitry that generates an intermediate mask for the current frame based on a neighborhood similarity metric; and a processor that refines the intermediate mask by using information from at least one video frame or mask.
Claim: 12. The apparatus of claim 11, further comprising: circuitry that generates initial foreground masks for a current frame with respect to each aligned reference frame based on a neighborhood similarity metric; a generator that combines information from the initial foreground masks to generate an intermediate mask for the current frame before refining the intermediate foreground mask.
Claim: 13. The apparatus of claim 11, wherein the processor uses information from at least one video frame or mask that is some combination of spatial and temporal information.
Claim: 14. The apparatus of claim 12, wherein the generator combines information using a consensus filtering mechanism.
Claim: 15. The apparatus of claim 11, wherein said memory and frame alignment mechanism uses multi-hop homography between frames.
Claim: 16. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates masks on a block basis.
Claim: 17. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using a normalized correlation metric.
Claim: 18. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using weighting factors that weight individual frames.
Claim: 19. The apparatus of claim 11, wherein the circuitry that generates initial foreground masks generates them using a three-level intermediate mask.
Claim: 20. The apparatus of claim 11, wherein said processor uses morphological operations to combine information from the foreground masks to generate a single mask for the current frame.
Current U.S. Class: 382/173
Current International Class: 06
Accession Number: edspap.20120294530
Database: USPTO Patent Applications
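
The abstract and claims 2, 4, 6, 7 and 8 outline a pipeline in which block-wise initial foreground masks are computed against each aligned reference frame and then merged by consensus filtering. The sketch below is one possible reading of those two stages, not the applicants' implementation. It assumes that the frames are already aligned, that a block counts as foreground when its similarity to the reference falls below a threshold, and that consensus is a weighted vote. The function names, the block size, the threshold, the quorum, and the plain zero-mean normalized correlation (standing in for the "modified" metric the record does not define) are all illustrative assumptions.

```python
from math import sqrt

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if den == 0.0:
        # Assumption: flat blocks are similar only if their means match.
        return 1.0 if mean_a == mean_b else 0.0
    return num / den

def block(frame, top, left, size):
    """Flatten the size x size block whose top-left pixel is (top, left)."""
    return [frame[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]

def initial_mask(current, reference, size=2, threshold=0.5):
    """Block-wise mask: 1 where the current block differs from the reference."""
    mask = []
    for top in range(0, len(current) - size + 1, size):
        row = []
        for left in range(0, len(current[0]) - size + 1, size):
            score = normalized_correlation(block(current, top, left, size),
                                           block(reference, top, left, size))
            row.append(0 if score >= threshold else 1)
        mask.append(row)
    return mask

def consensus_mask(masks, weights=None, quorum=0.5):
    """Weighted vote across initial masks (hypothetical consensus rule)."""
    if weights is None:
        weights = [1.0] * len(masks)
    total = sum(weights)
    return [[1 if sum(w * m[r][c] for w, m in zip(weights, masks)) / total > quorum
             else 0
             for c in range(len(masks[0][0]))]
            for r in range(len(masks[0]))]

def with_patch(frame, top, left, patch):
    """Copy of frame with a 2x2 patch pasted at (top, left)."""
    out = [row[:] for row in frame]
    for dr in range(2):
        for dc in range(2):
            out[top + dr][left + dc] = patch[dr][dc]
    return out

# Toy data: a textured background and a 2x2 "object" patch.
background = [[10, 20, 30, 40],
              [20, 30, 40, 50],
              [30, 40, 50, 60],
              [40, 50, 60, 70]]
patch = [[90, 10], [10, 90]]

current = with_patch(background, 0, 0, patch)        # object in top-left block
ref_same = background                                 # clean reference
ref_bright = [[p + 5 for p in row] for row in background]  # brightness shift
ref_cluttered = with_patch(background, 2, 2, patch)   # object elsewhere

masks = [initial_mask(current, ref)
         for ref in (ref_same, ref_bright, ref_cluttered)]
print(consensus_mask(masks))                     # [[1, 0], [0, 0]]
print(consensus_mask(masks, weights=[1, 1, 4]))  # [[1, 0], [0, 1]]
```

With equal weights, the spurious bottom-right detection caused by the cluttered reference is voted out. Weighting that reference heavily lets it through, which is why the choice of weighted frames matters. The refinement stage of claims 1, 3 and 10 (spatio-temporal information, morphological operations) is not covered by this sketch.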